Normal Probability Graphs and Correlation Coefficient $R^2_N$
For a valid regression, the errors in the observations and the prior information used in the
regression need to be random and the weighted errors need to be uncorrelated (Draper and Smith,
1981). In addition, inferential statistics such as confidence intervals generally require that the
observation errors be normally distributed (Helsel and Hirsch, 1992). The actual errors are unknown,
so the weighted residuals are analyzed. If the model accurately represents the actual system and
the observation errors are random and the weighted errors are independent, the weighted residuals
are expected to either be random, independent, and normally distributed, or have predictable
correlations. The first step is to determine whether the weighted residuals are independent and
normally distributed. If they are not, further analysis is needed to determine if the violations are
consistent with the expected correlations.
The test for independent, normal weighted residuals is conducted using normal probability
graphs of weighted residuals. Such graphs can be constructed as discussed by Hill (1994), using
files created by UCODE or MODFLOWP. The files are designed so that the graphs can be
constructed with commonly available x-y plotting software and arithmetic axes. If the weighted
residuals are independent and normally distributed, they will fall on an approximately straight line
in the normal probability graph. The associated summary statistic is $R^2_N$, the correlation
coefficient between the weighted residuals ordered from smallest to largest and the order statistics
from a N(0,1) probability distribution function (Brockwell and Davis, 1987, p. 304). This statistic
tests for independent, normally distributed weighted residuals and was chosen instead of other
statistics, such as chi-squared and Kolmogorov-Smirnov, because it is more powerful for commonly
used sample sizes (Shapiro and Francia, 1972). The correlation coefficient is calculated as:
$$R^2_N = \frac{\left[(\mathbf{e}_o - \mathbf{m})^T \boldsymbol{\tau}\right]^2}{\left[(\mathbf{e}_o - \mathbf{m})^T (\mathbf{e}_o - \mathbf{m})\right]\left(\boldsymbol{\tau}^T \boldsymbol{\tau}\right)} , \tag{25}$$

where all vectors are of length ND for $R^2_N$ evaluated only for the observation weighted residuals,
and length ND+NPR for $R^2_N$ evaluated for the observation and prior information weighted
residuals; $\mathbf{m}$ is a vector with all components equal to the average of the weighted residuals,
$\mathbf{e}_o$ is a vector of weighted residuals ordered from smallest to largest, and $\boldsymbol{\tau}$
is a vector with the ith element equal to the ordinate value of a N(0,1) probability distribution
function for a cumulative probability equal to $u_i = (i - 0.5)/ND$. A normal probability table (as in
Cooley and Naff, 1990, p. 44, or any standard statistics text) can be used to determine that, for
example, if $u_i = 0.8531$, then $\tau_i = 1.05$. UCODE and MODFLOWP print the ordered weighted
residuals $\mathbf{e}_o$ and $R^2_N$.
If $R^2_N$ falls too far below its ideal value of 1.0, the weighted residuals are not likely to
be independent and normally distributed. The critical values for $R^2_N$ for significance levels 0.05
and 0.10 are shown in Appendix D and the relevant critical values are printed by MODFLOWP
and UCODE with $R^2_N$.
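The calculation in equation 25 can be sketched in a few lines of Python. This is only an illustration, not the routine used by UCODE or MODFLOWP: the function names `normal_order_statistics` and `r2n` are hypothetical, the standard-library call `statistics.NormalDist.inv_cdf` stands in for the normal probability table, and the divisor in the cumulative probability is assumed to be the length of the residual vector (ND, or ND+NPR when prior information is included).

```python
from statistics import NormalDist, fmean


def normal_order_statistics(n):
    """Return tau: the N(0,1) value whose cumulative probability is (i - 0.5) / n."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # Assumption: n is the length of the residual vector being tested.
    return [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


def r2n(weighted_residuals):
    """Correlation coefficient between ordered weighted residuals and tau."""
    e_o = sorted(weighted_residuals)   # ordered from smallest to largest
    n = len(e_o)
    if n < 2:
        raise ValueError("at least two weighted residuals are required")

    mean = fmean(e_o)                  # every component of the vector m
    tau = normal_order_statistics(n)
    dev = [e - mean for e in e_o]      # e_o - m

    numerator = sum(d * t for d, t in zip(dev, tau)) ** 2
    denominator = sum(d * d for d in dev) * sum(t * t for t in tau)
    if denominator == 0.0:
        raise ValueError("all weighted residuals are identical")
    return numerator / denominator
```

Plotting each sorted residual against the matching element of `normal_order_statistics(n)` gives the points of a normal probability graph, and the value returned by `r2n` would then be compared with the critical value for the chosen significance level.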