module, tested for 10 million trials with event probability 0.00025. The graph illustrates
the probability for discrete numbers of events in the range from 2300 to 2700, covering
the mean value at 2500.
Figure 21.8. Example output of the Binomial cumulative distribution function. A
graph of the output generated using the binom.cdf() function from the scipy.stats module,
tested for 10 million trials with event probability 0.00025 and illustrating the cumulative
probability for discrete numbers of events in the range from 2300 to 2700,
covering the mean value at 2500.
Poisson distribution
If we know the average rate at which an event occurs, over a large number of
independent trials, then the Poisson distribution is the probability distribution of the
number of events that occur in a time interval. This is closely related to the binomial
distribution, but specifying the rate (λ) at which the event occurs means we don’t specify
the number of trials (n) or the probability of an event (p), though the rate λ is essentially p
× n. The Poisson distribution would be used instead of the binomial distribution in
situations where the number of trials is not measurable. For example, as we illustrate
below, if we observe the average number of births per day in a population, we
can calculate the probability distribution of the number of births per day without knowing
the size of the population. The binomial distribution approaches the Poisson distribution as
the number of trials (n) becomes large and the event probability (p) becomes small.
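The convergence described above can be checked numerically. The following is a minimal sketch, not part of the original example set: it holds the rate fixed at an assumed value of 4.7, picks an arbitrary event count of k = 5, and shrinks p as n grows so that p × n stays constant. The variable names are purely illustrative.

```python
from scipy.stats import binom, poisson

rate = 4.7  # Assumed fixed rate (lambda = n * p)
k = 5       # Arbitrary number of events at which to compare

poissonProb = poisson.pmf(k, rate)

diffs = []
for n in (10, 100, 1000, 10000, 100000):
  p = rate / n  # Reduce p as n grows so that n * p stays at the rate
  binomProb = binom.pmf(k, n, p)
  diffs.append(abs(binomProb - poissonProb))
  print('n: %6d binomial: %.5f Poisson: %.5f' % (n, binomProb, poissonProb))
```

With these settings the gap between the two probabilities should shrink steadily as n increases, which is the behaviour the text describes.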
For the Poisson distribution, the probability of observing k events, given
an occurrence rate of λ from independent trials, is:
$$P(k) = \frac{\lambda^k e^{-\lambda}}{k!}$$
Here e is the mathematical constant ≈2.71828 (Euler’s number). It can be shown that the
mean of the Poisson distribution is λ and the variance (see Chapter 22) is also λ.
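As a rough check of these statements, the standard Poisson formula, λ^k × e^(−λ) / k!, can be evaluated directly and compared with scipy.stats. This is only a sketch: poissonProbability() is a hypothetical helper written for this illustration, and the rate of 4.7 is an assumed value.

```python
from math import exp, factorial
from scipy.stats import poisson

def poissonProbability(k, rate):
  # Direct evaluation of rate^k * e^(-rate) / k!
  return rate ** k * exp(-rate) / factorial(k)

rate = 4.7  # Assumed rate, for illustration only

explicit = [poissonProbability(k, rate) for k in range(10)]
library = [poisson.pmf(k, rate) for k in range(10)]

for k in range(10):
  print('k: %2d explicit: %.6f scipy: %.6f' % (k, explicit[k], library[k]))

# Ask scipy for the mean ('m') and variance ('v') of the distribution
mean, variance = poisson.stats(rate, moments='mv')
print('Mean: %.2f variance: %.2f' % (float(mean), float(variance)))
```

The direct calculation is fine for small k, but the factorial and the power both grow very quickly, so for large event counts the library routine is the safer choice.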
We can implement the Poisson distribution using the scipy.stats module, which is quick
and robust compared to calculating the factorials and powers explicitly in basic Python.
For example, if a hospital has an average of 4.7 births per day, the Poisson
distribution gives the probability of observing a given number of births as follows:
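from scipy.stats import poisson
# Poisson random variable with a rate of 4.7 births per day
poissRandomVar = poisson(4.7)
for k in range(10):
  # Probability of observing exactly k births in a day
  pk = poissRandomVar.pmf(k)
  print('Number of births: %2d probability: %.3f' % (k, pk))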
We can apply the distribution to the restriction enzyme example we used above, which
shows that for large numbers of trials and small event probabilities the Poisson
distribution is a very good approximation for the binomial distribution.
from scipy.stats import poisson
from numpy import array
from matplotlib import pyplot
# The rate is the number of trials multiplied by the event probability
rate = 10000000 * 0.00025
poissRandomVar = poisson(rate)
counts = array(range(2300, 2700))
probs = poissRandomVar.pmf(counts)
pyplot.plot(counts, probs)
pyplot.show()
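To put a number on how close the approximation is, the two probability mass functions can be compared over the same range of counts. The sketch below reuses the trial count and event probability of the restriction enzyme example; the variable names and the choice of the largest absolute difference as a summary are assumptions made for illustration.

```python
from numpy import arange
from scipy.stats import binom, poisson

n = 10000000  # Number of trials
p = 0.00025   # Event probability

counts = arange(2300, 2700)
binomProbs = binom.pmf(counts, n, p)
poissProbs = poisson.pmf(counts, n * p)

# Largest point-by-point disagreement between the two distributions
maxDiff = abs(binomProbs - poissProbs).max()

print('Largest difference: %.2e' % maxDiff)
print('Peak Poisson probability: %.2e' % poissProbs.max())
```

The largest difference should come out several orders of magnitude smaller than the peak probability, so the two curves would be indistinguishable on a plot.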