Python Programming for Biology: Bioinformatics and Beyond

[Tim J. Stevens, Wayne Boucher] Python Programming


Figure 21.7. Example output of the binomial distribution for a large number of trials. A graph of the output generated using the binom.pmf() function from the scipy.stats module, tested for 10 million trials with event probability 0.00025. The graph illustrates the probability for discrete numbers of events in the range from 2300 to 2700, covering the mean value at 2500.

Figure 21.8. Example output of the binomial cumulative distribution function. A graph of the output generated using the binom.cdf() function from the scipy.stats module, tested for 10 million trials with event probability 0.00025 and illustrating the cumulative probability for discrete numbers of events in the range from 2300 to 2700, covering the mean value at 2500.

Poisson distribution

If we know the average rate at which an event occurs, over a large number of independent trials, then the Poisson distribution gives the probability of each possible number of events occurring in a fixed interval of time. This is closely related to the binomial distribution, but because we specify the rate (λ) at which the event occurs we do not need to specify the number of trials (n) or the probability of an event (p), though the rate λ is essentially p × n. The Poisson distribution is used instead of the binomial distribution in situations where the number of trials is not measurable. For example, as we illustrate below, if we observe the average rate of births per day in a population, we can calculate the probability distribution of the number of births per day without knowing the size of the population. The binomial distribution approaches the Poisson distribution as the number of trials (n) becomes large and the event probability (p) becomes small, with p × n held at λ.
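The limiting behaviour described here can be checked numerically. The following is only a minimal sketch: the rate of 4.7, the range of counts and the chosen values of n are illustrative assumptions, and the variable names are our own. For each n we set p = λ/n, so that p × n stays fixed, and record the largest difference between the binomial and Poisson probabilities.

```python
from numpy import arange
from scipy.stats import binom, poisson

rate = 4.7              # Assumed example rate (lambda)
counts = arange(0, 21)  # Assumed range of event counts to compare

poissProbs = poisson.pmf(counts, rate)

maxDiffs = []
for n in (10, 100, 1000, 10000):
  p = rate / n  # Keep p * n equal to the rate as n grows
  binomProbs = binom.pmf(counts, n, p)

  # Largest absolute difference between the two distributions
  maxDiff = abs(binomProbs - poissProbs).max()
  maxDiffs.append(maxDiff)

  print('n=%6d p=%.5f largest difference: %.2e' % (n, p, maxDiff))
```

The printed differences should shrink as n increases, which is the sense in which the binomial distribution approaches the Poisson distribution.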

For the Poisson distribution the equation for the probability of observing k events given an occurrence rate of λ from independent trials is:

$$\Pr(k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$$

Here e is the mathematical constant ≈2.71828 (Euler’s number). It can be shown that the mean of the Poisson distribution is λ and the variance (see Chapter 22) is also λ.
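To make the formula concrete, here is a minimal sketch that evaluates the standard Poisson probability, λ to the power k, multiplied by e to the power -λ, divided by k factorial, directly in basic Python and compares it with scipy.stats. The function name poissonPmf and the rate of 4.7 are assumptions made for illustration. The last lines check that the mean and variance reported by scipy.stats both equal λ.

```python
from math import exp, factorial
from scipy.stats import poisson

def poissonPmf(k, rate):
  # Explicit formula: rate^k * e^(-rate) / k!
  return rate ** k * exp(-rate) / factorial(k)

rate = 4.7  # Assumed example rate (lambda)

for k in range(5):
  explicit = poissonPmf(k, rate)
  library = poisson.pmf(k, rate)
  print('k=%d explicit: %.6f scipy: %.6f' % (k, explicit, library))

# Both the mean and the variance should equal the rate
mean = poisson.mean(rate)
variance = poisson.var(rate)
print('Mean: %.2f variance: %.2f' % (mean, variance))
```

The explicit version is fine for small k, but the factorial and power grow very quickly, so it is not a substitute for the library routine at large counts.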

We can implement the Poisson distribution using the scipy.stats module, which is quicker and more robust than calculating the factorials and powers explicitly in basic Python. For example, if a hospital has an average of 4.7 births per day, the Poisson distribution estimates the probability of observing a given number of births as follows:

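from scipy.stats import poisson

# Poisson random variable with an average rate of 4.7 births per day
poissRandomVar = poisson(4.7)

for k in range(10):
  pk = poissRandomVar.pmf(k)  # Probability of exactly k births
  print('Number of births: %2d probability: %.3f' % (k, pk))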

We can apply the distribution to the restriction enzyme example we used above. This shows that, for large numbers of trials and small event probabilities, the Poisson distribution is a very good approximation to the binomial distribution.

from scipy.stats import poisson
from numpy import array
from matplotlib import pyplot

rate = 10000000 * 0.00025  # Trials * event probability, i.e. a rate of 2500
poissRandomVar = poisson(rate)

counts = array(range(2300, 2700))
probs = poissRandomVar.pmf(counts)

pyplot.plot(counts, probs)
pyplot.show()
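The quality of the approximation can also be stated as a number instead of being judged from a graph. The sketch below is an illustration, not part of the original example: it evaluates both distributions for the same 10 million trials and event probability 0.00025, then reports the largest absolute difference across the plotted range. The variable names are our own.

```python
from numpy import arange
from scipy.stats import binom, poisson

n = 10000000  # Number of trials, as in the restriction enzyme example
p = 0.00025   # Event probability

counts = arange(2300, 2701)
binomProbs = binom.pmf(counts, n, p)
poissProbs = poisson.pmf(counts, n * p)

# Compare the peak probability with the largest discrepancy
peak = poissProbs.max()
maxDiff = abs(binomProbs - poissProbs).max()

print('Peak probability: %.5f' % peak)
print('Largest difference: %.2e' % maxDiff)
```

The largest difference should be several orders of magnitude smaller than the peak probability, which is why the two curves are indistinguishable when plotted.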
