Detection of chemical concentration trends in environmental contaminants is a critical step in assessing the environmental condition of a given system. For example, positive identification of contaminant concentration trends can assist in both proving plume migration and demonstrating evidence of groundwater contaminant degradation. When considering degradation of a contaminant, conclusive demonstration of the movement of a particular contaminant (e.g. petroleum hydrocarbons) into an area, followed by a subsequent decrease in oxygen concentration (and other potential electron acceptors), while showing an increase in reaction by-products, may be needed to prove contaminant degradation. Each of these individual trends alone may not adequately support a claim of biologically based degradation, but the sequential appearance of a biodegradable contaminant, disappearance of an appropriate electron acceptor, and development of reaction end-products is increasingly being accepted as evidence of in situ remediation.
One of the difficulties encountered in the interpretation of environmental field data is the quantification of trends (e.g. calculation of slope) and the demonstration that the estimated trend is statistically different from zero. The focus here is on one non-parametric method for estimating slope, known simply as Sen's Nonparametric Estimator of Slope. Both the methodology and an example (using an artificial groundwater data set) for Sen's method are provided.
Several tests are available for the detection and/or quantification of trends. The first step in analyzing any data set, however, is to graph the data, usually as a function of time or location. Graphical representations of data facilitate observation of general trends and cycles, which may assist in the selection of an appropriate statistical test.
Table 1 summarizes several methods for detecting and/or estimating trends. Each technique has advantages and disadvantages, so the type and volume of data collected should be examined carefully before selecting a particular technique.
| Test Procedure | Applicability | Notes | Reference(s) |
|---|---|---|---|
| Graphical Methods | Visual estimate of trend presence/absence | No quantifiable results | |
| Linear Regression | Provides an estimate of slope, confidence interval, and quantifies goodness of fit | Allows quantified estimate of influence of multiple independent variables; does not handle missing data; does not handle below-detection (BD) measurements; may be greatly affected by outliers and cyclic data | |
| Box-Jenkins Model | Test for trends in long-term, regularly spaced data | Requires large data set; requires constant temporal spacing of data sets | Box and Jenkins (1976) |
| Mann-Kendall | Yes/No test for existence of slope | Non-parametric test; allows missing data; allows reporting of levels BD; not affected by gross data errors and outliers | Mann (1945); Kendall (1980) |
| Sen's Method | Estimates value and confidence interval for slope | Allows missing data; makes no assumptions on distribution of data; not affected by gross data errors and outliers | Sen (1968); Theil (1950) |
Sen's method for the estimation of slope requires a time series of equally spaced data. The method proceeds by calculating the slope as a change in measurement per change in time, as shown here in Equation (1) and Table 2 for the simple case of one data measurement per time step.

$$Q = \frac{x_{i'} - x_i}{i' - i} \tag{1}$$

where: $Q$ = slope between data points $x_{i'}$ and $x_i$

$x_{i'}$ = data measurement at time $i'$

$x_i$ = data measurement at time $i$

$i'$ = time after time $i$
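| Time | 1 | 2 | 3 | ... | T-1 | T |
|---|---|---|---|---|---|---|
| Data | $X_1$ | $X_2$ | $X_3$ | ... | $X_{T-1}$ | $X_T$ |
| Slopes | (none) | $(X_2 - X_1)/(2 - 1)$ | $(X_3 - X_1)/(3 - 1)$ | ... | $(X_{T-1} - X_1)/(T - 2)$ | $(X_T - X_1)/(T - 1)$ |
| | | | $(X_3 - X_2)/(3 - 2)$ | ... | : | $(X_T - X_2)/(T - 2)$ |
| | | | | | : | $(X_T - X_3)/(T - 3)$ |
| | | | | | : | : |
| | | | | | | $(X_T - X_{T-2})/2$ |
| | | | | | $(X_{T-1} - X_{T-2})/1$ | $(X_T - X_{T-1})/1$ |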
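The pairwise calculation in Equation (1) and Table 2 can be sketched in a few lines of Python. This is a minimal illustration, assuming one measurement per equally spaced time step; the function name `pairwise_slopes` and the concentration values are hypothetical and are not taken from the example data set.

```python
from itertools import combinations

def pairwise_slopes(values):
    """Return (x_i' - x_i) / (i' - i) for every pair of time steps i' > i.

    values[k] is the single measurement at time step k + 1.
    """
    points = list(enumerate(values, start=1))  # (time step, measurement)
    return [(x2 - x1) / (t2 - t1)
            for (t1, x1), (t2, x2) in combinations(points, 2)]

# Hypothetical concentrations (mg/L) at four equally spaced time steps.
slopes = pairwise_slopes([10.0, 8.0, 9.0, 4.0])
print(len(slopes))  # n(n-1)/2 slopes for n measurements
print(slopes)
```

With n measurements and no missing data, the list holds n(n-1)/2 slopes, one for each cell of Table 2.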
If multiple measurements are recorded for a given time step, two options exist. The first option is to combine the measurements for that time step into a single measure of central tendency (e.g. mean, median) and proceed as above. The second option is to calculate a slope for each individual measurement, as shown in Table 3 below. Note that the slope between measurements collected at the same time is not calculated.
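| Time | 1 | 1 | 2 | 2 | 2 | 3 | ... | T |
|---|---|---|---|---|---|---|---|---|
| Data | $X_{1,1}$ | $X_{1,2}$ | $X_{2,1}$ | $X_{2,2}$ | $X_{2,3}$ | $X_{3,1}$ | ... | $X_{T,J}$ |
| Slopes | | NC | $(X_{2,1} - X_{1,1})/(2 - 1)$ | $(X_{2,2} - X_{1,1})/(2 - 1)$ | $(X_{2,3} - X_{1,1})/(2 - 1)$ | $(X_{3,1} - X_{1,1})/(3 - 1)$ | ... | $(X_{T,J} - X_{1,1})/(T - 1)$ |
| | | | $(X_{2,1} - X_{1,2})/(2 - 1)$ | $(X_{2,2} - X_{1,2})/(2 - 1)$ | $(X_{2,3} - X_{1,2})/(2 - 1)$ | $(X_{3,1} - X_{1,2})/(3 - 1)$ | ... | $(X_{T,J} - X_{1,2})/(T - 1)$ |
| | | | | NC | NC | $(X_{3,1} - X_{2,1})/(3 - 2)$ | ... | : |
| | | | | | NC | $(X_{3,1} - X_{2,2})/(3 - 2)$ | ... | : |
| | | | | | | $(X_{3,1} - X_{2,3})/(3 - 2)$ | ... | : |
| | | | | | | | | $(X_{T,J} - X_{T-1,J-1})/1$ |
| | | | | | | | | NC |
| | | | | | | | | : |
| | | | | | | | | NC |

NC = not calculated (measurements collected at the same time).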
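The second option (Table 3) can be sketched as follows. This is an illustrative sketch only: the function name `pairwise_slopes_replicates` and the sample values are hypothetical, and each sample is assumed to be given as a `(time step, measurement)` pair.

```python
from itertools import combinations

def pairwise_slopes_replicates(samples):
    """Slopes between every pair of measurements taken at different times.

    samples: list of (time step, measurement); a time step may repeat.
    Pairs sharing a time step are skipped (the "NC" cells of Table 3).
    """
    ordered = sorted(samples)  # order by time so that t2 >= t1 in every pair
    return [(x2 - x1) / (t2 - t1)
            for (t1, x1), (t2, x2) in combinations(ordered, 2)
            if t2 != t1]

# Hypothetical data: two samples at time 1, three at time 2, one at time 3.
samples = [(1, 5.0), (1, 7.0), (2, 4.0), (2, 6.0), (2, 5.0), (3, 3.0)]
slopes = pairwise_slopes_replicates(samples)
print(len(slopes))
```

For this layout, the six measurements form 15 pairs, of which 4 share a time step, so 11 slopes are calculated. This matches the count of non-NC cells in the first six data columns of Table 3.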
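Upon calculation of slope by either method outlined above, Sen's estimator of slope is simply given by the median slope, shown below as:

$$Q' = \begin{cases} Q_{[(N'+1)/2]} & \text{if } N' \text{ is odd} \\ \tfrac{1}{2}\left(Q_{[N'/2]} + Q_{[(N'+2)/2]}\right) & \text{if } N' \text{ is even} \end{cases} \tag{2}$$

where: $N'$ = number of calculated slopes

$Q_{[k]}$ = the slope of rank $k$ when the $N'$ slopes are ranked from smallest to largest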
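Equation (2) amounts to taking the median of the ranked slopes. A minimal sketch, assuming the slopes have already been calculated (the function name `sen_slope` and the input values are hypothetical):

```python
def sen_slope(slopes):
    """Median of the calculated slopes, following Equation (2)."""
    ranked = sorted(slopes)          # rank slopes from smallest to largest
    n_prime = len(ranked)            # N' = number of calculated slopes
    if n_prime == 0:
        raise ValueError("at least one slope is required")
    mid = n_prime // 2
    if n_prime % 2 == 1:
        return ranked[mid]           # Q[(N'+1)/2] in 1-based ranks
    return (ranked[mid - 1] + ranked[mid]) / 2   # mean of Q[N'/2], Q[(N'+2)/2]

# Hypothetical slopes (mg/L per time step).
q_median = sen_slope([-2.0, -0.5, -2.0, 1.0, -2.0, -5.0])
print(q_median)
```

The standard library function `statistics.median` gives the same result; the explicit version is shown only to mirror the odd and even cases of Equation (2).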
Sen's method also allows determination of whether the median slope is statistically different from zero. A confidence interval is developed by estimating the ranks of the lower and upper confidence limits and using the slopes corresponding to these ranks to define the confidence interval for $Q'$. For a two-sided confidence interval about the median slope, first find the Z statistic for a two-tailed normal distribution test. For example, if a two-sided confidence interval of 95% is desired, find $Z_{1-0.05/2} = Z_{0.975} = 1.96$. Next, estimate the variance of the Mann-Kendall statistic, VAR(S), as developed by Kendall (1975):

$$\mathrm{VAR}(S) = \frac{1}{18}\left[ n(n-1)(2n+5) - \sum_{p=1}^{q} t_p (t_p - 1)(2 t_p + 5) \right] \tag{3}$$

where: $n$ = number of data points

$t_p$ = the number of ties for the pth value

$q$ = the number of tied values
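The variance in Equation 3 can be sketched as below. This sketch assumes one measurement per time step and treats each group of identical data values as one tied value; the function name `mann_kendall_variance` and the input values are hypothetical.

```python
from collections import Counter

def mann_kendall_variance(values):
    """VAR(S) for the Mann-Kendall statistic, with the correction for ties."""
    n = len(values)                                   # number of data points
    # t_p: how many data points share the p-th tied value (groups of 2 or more)
    tie_sizes = [count for count in Counter(values).values() if count > 1]
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in tie_sizes)
    return (n * (n - 1) * (2 * n + 5) - tie_term) / 18

# Hypothetical data sets of ten points, without and with ties.
var_no_ties = mann_kendall_variance([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
var_ties = mann_kendall_variance([1, 1, 2, 3, 3, 3, 4, 5, 6, 7])
print(var_no_ties, var_ties)
```

Ties reduce the variance: for ten points without ties the first term gives 10 x 9 x 25 / 18 = 125, while one pair and one triple of tied values subtract (18 + 66) / 18 from that.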
Gilbert (1987) notes that Equation 3 is valid for all n > 40, while Kendall indicates that Equation 3 may be used for n between 10 and 40 as long as there are not many tied data pairs. To estimate the range of ranks for the specified confidence interval, find $C_\alpha$ using:

$$C_\alpha = Z_{1-\alpha/2}\sqrt{\mathrm{VAR}(S)} \tag{4}$$

Using the value from Equation 4, find the ranks of the lower ($M_1$) and upper ($M_2 + 1$) confidence limits using:

$$M_1 = \frac{N' - C_\alpha}{2}, \qquad M_2 = \frac{N' + C_\alpha}{2} \tag{5}$$
Finally, choose the slopes with ranks M1 and M2+1 (counting from the smallest slope) as the lower and upper confidence limits, respectively. Note that the median slope is then defined as statistically different from zero (for the selected confidence interval) if zero does not lie between the lower and upper confidence limits.
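The rank-based confidence limits can be sketched as follows. This is a hedged illustration, not a definitive implementation: the function name `sen_confidence_limits` is hypothetical, and because M1 and M2+1 are generally not whole numbers, the sketch assumes the ranks are rounded outward (down for the lower limit, up for the upper limit). Interpolating between adjacent slopes is another possible convention.

```python
from math import ceil, floor, sqrt
from statistics import NormalDist

def sen_confidence_limits(slopes, var_s, confidence=0.95):
    """Lower and upper confidence limits for the median slope."""
    ranked = sorted(slopes)
    n_prime = len(ranked)                          # N' = number of slopes
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # e.g. 1.96 for 95%
    c = z * sqrt(var_s)                            # Equation 4
    m1 = (n_prime - c) / 2                         # Equation 5
    m2 = (n_prime + c) / 2
    lower_rank = floor(m1)                         # assumption: round outward
    upper_rank = ceil(m2 + 1)
    if lower_rank < 1 or upper_rank > n_prime:
        raise ValueError("too few slopes for this confidence level")
    return ranked[lower_rank - 1], ranked[upper_rank - 1]   # 1-based ranks

# Hypothetical case: 45 slopes (ten data points, no ties, VAR(S) = 125).
slopes = [float(k) for k in range(1, 46)]
lower, upper = sen_confidence_limits(slopes, var_s=125.0, confidence=0.95)
print(lower, upper)
```

The slope is then judged statistically different from zero when `lower > 0` or `upper < 0`.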
Potential Difficulties
Missing Data: For Sen's test, simply do not calculate a slope for the missing data point (making sure not to count missing data points in with the total number of samples, n). If large amounts of data are missing, Sen's method is not recommended.
No Detection (ND) or Trace Data: Sen's method may still be used to estimate a median slope if the number of ND measurements is less than (n-1)/2, but ND measurements may severely limit the ability to estimate a confidence interval about this estimate. Gilbert recommends setting ND measurements equal to 1/2 the detection limit and proceeding with calculation of individual slopes.
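Both difficulties can be handled in a small preprocessing step before the slopes are calculated. The sketch below is illustrative only; the function name `prepare_series`, the use of `None` for a missing value, the string `"ND"` for a non-detect, and the detection limit of 0.5 are all assumptions made for the example.

```python
def prepare_series(values, detection_limit):
    """Return (time step, measurement) pairs ready for slope calculation.

    values: one entry per time step; None marks a missing measurement and
    the string "ND" marks a measurement below the detection limit.
    """
    prepared = []
    for time, value in enumerate(values, start=1):
        if value is None:
            continue                      # missing: no slope, not counted in n
        if value == "ND":
            value = detection_limit / 2   # substitute half the detection limit
        prepared.append((time, float(value)))
    return prepared

# Hypothetical series with one missing value and one non-detect.
series = prepare_series([4.0, None, "ND", 1.0], detection_limit=0.5)
print(series)
```

The original time steps are kept, so the denominator of each slope still reflects the true time separation even when a measurement is missing.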
For this example, let us assume that we are trying to prove that a toxic subsurface contaminant is degrading in the presence of oxygen. Let us also assume that the biochemical degradation reaction produces a known reaction end product. Given the fictitious field measurements tabulated below (Figure 1) for the concentration of contaminant, dissolved oxygen, and reaction end product, show that the following trends are statistically significant: a decrease in contaminant concentration, a decrease in dissolved oxygen concentration, and an increase in reaction end product concentration.

Step 1. Calculate the slope for each pair of data points.
For Sen's method, a slope is shown to be statistically different from zero if zero does not exist within a two-sided confidence interval about the median slope estimate. For this example, Equation 1 or the first equation in Table 2 may be used to calculate slope for contaminant concentration change from March 1991 (Time =1) to June 1991 (Time = 2). The slope is given by:
.
Similarly, the slopes for the December 1991 data set (Column 6 of Table 2) are calculated as:
.
Note that the data reported in Figure 1 are provided on a quarterly basis. The temporal spacing of the data has therefore been translated into equal time periods (each corresponding to one quarter of a year), and the slopes have units of mg/L/quarter-year. If the dates of collection had been provided, individual slopes could have been calculated as change in concentration per day.
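If collection dates were available, the same pairwise calculation could use elapsed days in the denominator. A minimal sketch, in which the function name `slopes_per_day`, the dates, and the concentrations are all hypothetical and are not the values from Figure 1:

```python
from datetime import date
from itertools import combinations

def slopes_per_day(samples):
    """Pairwise slopes in concentration units per day.

    samples: list of (collection date, measurement) pairs.
    """
    ordered = sorted(samples)   # chronological order
    return [(x2 - x1) / (d2 - d1).days
            for (d1, x1), (d2, x2) in combinations(ordered, 2)
            if d2 != d1]        # same-day pairs are not calculated

# Hypothetical quarterly samples (mg/L) with assumed collection dates.
samples = [(date(1991, 3, 1), 10.0),
           (date(1991, 6, 1), 8.0),
           (date(1991, 9, 1), 9.0)]
daily = slopes_per_day(samples)
print(daily)
```

The median and confidence limits are then found exactly as before; only the units of the slopes change.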
Continued calculation of individual slopes leads to the values compiled for the contaminant in Table 4, dissolved oxygen in Table 5, and the reaction end product in Table 6. Note that the red values in these tables indicate a negative slope (decreasing concentration), the blue values indicate a positive slope (increasing concentration), and the black values indicate a slope of zero (no change in concentration).



Step 2. Rank the calculated slopes (listed in Tables 4, 5, & 6). Determine the median slope, Q', for each compound.
Median slopes for the various compounds are given by:
.
Step 3. Test Sen's Estimate of Slope for statistical difference from zero for a 90% confidence interval.
Step 3.1 Calculate the VAR(S) based on the Mann Kendall test for n greater than 40. (See Equation 3)
For the contaminant: ![]()
Step 3.2 Find the Z statistic for the desired confidence level:
For 90% confidence: $Z_{1-0.10/2} = Z_{0.95} = 1.645$
Step 3.3 Multiply Z0.95 by the square root of VAR(S) from Step 3.1 (see Equation 4).
For the contaminant: ![]()
Step 3.4 Compute M1 and M2
For the contaminant: 
Step 3.5 Select the lower and upper confidence limits as the slopes with rank M1 and M2+1, respectively.
For the contaminant: 
Step 4. Results of Example.
As shown in Table 7, the median contaminant and dissolved oxygen slopes are negative (decreasing) and the median slope of the reaction end product is positive (increasing). Furthermore, these slopes are shown to be different from zero for a 90 percent confidence interval. Thus, all three of the original hypotheses have been met, and the conclusion can be made that the contaminant is degrading while dissolved oxygen is decreasing and the reaction end product is increasing.

Send comments or suggestions to:
Student Author: J. Steven Brauner, sbrauner@vt.edu
Faculty Advisor: Daniel Gallagher, dang@vt.edu
Copyright © 1997 Daniel Gallagher
Last Modified: 2May1997