Figure 22.6. A histogram of observed and expected counts for a G-test. The
illustrated histogram represents the observed counts for a sample of 100 values that have
been binned into regions of width 0.05. The expected counts, represented by black spots,
are calculated from the probability density function of the normal distribution, which the
histogram approximates. For the G-test the different bins are treated as independent
random variables, and overall we estimate the goodness of fit of the sample histogram to
the normal expectation. It should be noted that because some counts are small (<5) it
would be inappropriate to apply the chi-square test.
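As an aside, a plot of this kind can be produced with a minimal sketch along the following lines, where the sample is simulated and the mean, standard deviation and bin edges are illustrative assumptions rather than values taken from the figure:
from matplotlib import pyplot
from numpy import arange
from scipy.stats import norm
mean, std, nObs, width = 1.76, 0.075, 100, 0.05 # illustrative, assumed values
sample = norm.rvs(mean, std, size=nObs) # simulate 100 height-like values
edges = arange(1.5, 2.01, width) # bin edges of width 0.05
counts, edges, patches = pyplot.hist(sample, bins=edges, color='grey')
centres = edges[:-1] + width/2.0
expected = nObs * width * norm.pdf(centres, mean, std) # PDF scaled to counts
pyplot.plot(centres, expected, 'ko') # expected counts as black spots
pyplot.show()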
We can illustrate this for the heights of people, using a null hypothesis with a normal
distribution, and then test the goodness of fit for different height categories. Here bins
contains the centres of the histogram ranges and obsd contains the corresponding
number of observed counts for each bin. The expected counts expd are calculated for each
of the bins values by applying the .pdf() method of the normal distribution (with
appropriate mean and standard deviation) and then scaling so that the total is the same as
the number of observations.
from scipy.stats import chisquare, norm
from numpy import array
bins = array([1.65, 1.7, 1.75, 1.8, 1.85])
obsd = array([ 14, 15, 33, 22, 8])
mean = 1.76
std = 0.075
nObs = obsd.sum()
expd = norm.pdf(bins, mean, std)
expd *= nObs / expd.sum()
# Expected counts: 9.196, 19.576, 26.720, 23.385, 13.122
chSqStat, pValue = chisquare(obsd, expd)
print('Chi square A', chSqStat, pValue)
The result is a chi-square statistic of 7.14 and a corresponding p-value of 0.129 (12.9%).
Hence, we would not reject the null hypothesis. As an alternative method we could use the
SciPy distribution chi2 and perform the statistical test for a one-tailed value as we
illustrated previously, noting that here we have to explicitly specify the number of degrees
of freedom for the chi-square distribution (which is one fewer than the number of
categories):
from scipy.stats import chi2
degFree = len(obsd)-1
pValue = chi2.sf(chSqStat, degFree) # Result is 0.129
Not only can the chi-square test be applied to discrete categorical counts, as we illustrate
above, but the process can also be applied to the fundamental parameters of a probability
distribution. For example, the null hypothesis may be a specific probability distribution
that can be described by a few fundamental parameters (μ, σ etc.), and we can compare
that with the alternative hypothesis where the parameters are unrestricted. In the case of
the mean and standard deviation there would be three degrees of freedom in the test (two
of which come from the fundamental parameters).
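As a hedged aside (not part of the original recipe): when the expected counts come from a distribution whose parameters were themselves estimated from the observed data, the convention is to remove one further degree of freedom per fitted parameter. SciPy's chisquare() supports this via its ddof argument, which compares the statistic against a chi-square distribution with k-1-ddof degrees of freedom:
# Assuming mean and std were fitted from the same data: df = 5 - 1 - 2 = 2
chSqStat, pValue = chisquare(obsd, expd, ddof=2)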
Pearson’s chi-square test is commonly used, and thus implemented in SciPy, but there
are limitations to its use, especially concerning low counts. A popular heuristic is that an
expected count of at least five is required for each random variable (each category).
Although corrections can be applied to account for this, in general where counts are small
it may be easier to apply a different test. Indeed, the chi-square statistic is really only an
approximation to the likelihood ratio statistic, developed in the days before computers for
ease of calculation. In the modern computer age we have little problem implementing
likelihood ratio tests, such as the G-test that we describe next, which is less sensitive to
small counts. The G-test uses the following statistic:
G = 2 Σi Oi ln(Oi/Ei)
where Oi is the observed count and Ei the expected count for category i.
The statistic is a more accurate version of chi-square, so we still do statistical tests using
the chi-square distribution with the same degrees of freedom as the equivalent chi-square
test. The G-statistic is not only useful for deriving probabilities, but also has a useful
meaning in itself. It is the same thing as twice the Kullback-Leibler divergence, or
relative entropy: a measure of the extra information required to get the observed
distribution from the expected distribution. Unfortunately the G-test is not implemented
directly in a single SciPy function, but we can easily do the required steps, making use of
the chi-square distribution’s survival function (chi2.sf()) to obtain a p-value. Firstly, we
gather some test data, which is similar to before, although we have included a few more
histogram bins with small counts. Then we calculate the expectation from the normal
distribution:
bins = array([1.55, 1.6, 1.65, 1.7, 1.75, 1.8, 1.85, 1.90])
obsd = array([ 3, 4, 14, 15, 33, 22, 8, 1])
mean = 1.76
std = 0.075
nObs = obsd.sum()
expd = norm.pdf(bins, mean, std)
expd *= nObs/expd.sum()
# Expected counts: 0.535, 2.769, 9.194, 19.571, 26.713, 23.379, 13.119, 4.720
The G statistic is easily calculated for the arrays by taking the natural logarithm of the
ratio of observed to expected counts, multiplying by the observed counts and then taking
twice the sum of the resulting array. The resulting G value is then passed to the
chi-square survival function, with the appropriate number of degrees of freedom:
from numpy import log
g = 2.0 * sum(obsd * log(obsd/expd))
degFree = len(obsd)-1
pValue = chi2.sf(g, degFree)
print('G test', g, pValue) # Result is: 17.34, 0.015
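To make the relative-entropy interpretation concrete, the following sketch uses SciPy's entropy() function, which computes the Kullback-Leibler divergence between two normalised distributions; scaling the divergence back up to counts recovers the G value of 17.34 printed above:
from scipy.stats import entropy
# Twice the KL divergence, measured in counts, equals the G statistic
gCheck = 2.0 * nObs * entropy(obsd/nObs, expd/nObs) # 17.34, matching g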
As illustrated, the result is a one-tailed probability of 1.5%. If we compare this with the
chi-square test, even though it is not appropriate to do so with such small counts, we get:
chSqStat, pValue = chisquare(obsd, expd)
print('Chi-square', chSqStat, pValue) # Result: 21.98, 0.00256
With the chi-square test the probability is now estimated to be 0.26%, which is
substantially different from the G-test value.
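As a final aside, more recent SciPy releases do offer the G-test in a single call: power_divergence() computes the same statistic and p-value when its lambda_ argument is set to 'log-likelihood':
from scipy.stats import power_divergence
gStat, pValue = power_divergence(obsd, expd, lambda_='log-likelihood')
print('G test', gStat, pValue) # Same result: 17.34, 0.015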