Python Programming for Biology: Bioinformatics and Beyond



[Tim J. Stevens, Wayne Boucher] Python Programming

Binomial distribution

Given an event with a fixed probability of occurrence, the binomial distribution is the
probability distribution of the number of events that occur after a specified number of
independent trials. A simple example of this would be the event of rolling a six on a die,
i.e. with probability 1/6, where after a specified total number of rolls we can count the
number of times that a six came up. Repeating the same experiment (with the same total
number of rolls) will result in a distribution of different counts for rolling a six. The
probability of getting a given count of sixes is described by the binomial distribution. For
a given event probability and given number of trials, the probability of a count can be
calculated using the formula presented below. This is based on the notion that the
probability of a count depends on the number of arrangements in which the count can be
obtained. To take the example of rolling a die three times, where there are 216 (6×6×6)
possible outcomes, there is only one way of getting a count of three sixes, but there are 15
ways of getting two sixes (a non-six can occur in three positions, and there are five
possibilities for each), 75 ways of getting one six (a six can occur at three positions and
there are five times five possibilities for the non-sixes) and 125 ways of getting no six
(five possibilities for each position).
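The counts quoted above can be checked by brute force. The following is a minimal sketch, not part of the original text, that enumerates every ordered outcome of three rolls with the standard library; the function name count_sixes_in_all_outcomes is a hypothetical helper introduced only for this illustration.

```python
from collections import Counter
from itertools import product

def count_sixes_in_all_outcomes(num_rolls=3):
    """Tally how many ordered outcomes of num_rolls die rolls contain each count of sixes."""
    tally = Counter()
    # product(...) yields every ordered outcome, e.g. (1, 6, 3)
    for outcome in product(range(1, 7), repeat=num_rolls):
        tally[outcome.count(6)] += 1
    return dict(tally)

counts = count_sixes_in_all_outcomes()
for num_sixes in sorted(counts):
    print(num_sixes, counts[num_sixes])
```

The four tallies should add up to the 216 possible outcomes, which is what lets each count be turned into a probability by dividing by 216.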

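The probability Pr(k) of observing k events from n independent trials given event
probability p is:

$$\Pr(k) = \frac{n!}{k!\,(n-k)!}\, p^{k} (1-p)^{n-k}$$

This is often written using $\binom{n}{k}$, which is notation for the combinatorial factor, giving the
number of ways of choosing k items from a total of n:

$$\Pr(k) = \binom{n}{k}\, p^{k} (1-p)^{n-k}, \qquad \binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$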

If we seek the probability of getting two sixes from three rolls we multiply the probability
of getting two sixes, $p^{k} = (1/6)^{2}$, by the probability of getting a non-six in the other rolls,
$(1-p)^{n-k} = (5/6)^{3-2}$, by the number of ways of choosing two successes from three rolls,
$\binom{3}{2} = 3$, and the result is indeed 15/216.
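The arithmetic in this worked example can be reproduced exactly with rational numbers, which avoids any floating-point rounding. This is only an illustrative sketch: it assumes Python 3.8 or later for math.comb, and exact_binomial_probability is a hypothetical name, not a function from the text.

```python
from fractions import Fraction
from math import comb

def exact_binomial_probability(n, k, p):
    """Binomial probability as an exact Fraction; p is expected to be a Fraction."""
    # Number of arrangements times the probability of one such arrangement
    return comb(n, k) * p**k * (1 - p)**(n - k)

result = exact_binomial_probability(3, 2, Fraction(1, 6))
print(result)                        # Fraction stores 15/216 in lowest terms
print(result == Fraction(15, 216))
```

Because Fraction always reduces to lowest terms, the printed value looks different from 15/216 even though the two are equal.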

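We can define a function to calculate this in Python, using the handy comb, which we
can import from SciPy to calculate the combinatorial factor:

# comb is provided by scipy.special; the older scipy.misc.comb
# has been removed from recent SciPy releases
from scipy.special import comb

def binomialProbability(n, k, p):
    # Number of arrangements times the probability of any one arrangement
    return comb(n, k) * p**k * (1-p)**(n-k)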

To test this we can again calculate the probability of getting two sixes from three rolls
of a die:

p = 1/6.0 # Probability of event
n = 3 # Number of trials
k = 2 # Number of events sought

print( binomialProbability(n, k, p) )
# Result is 0.069444444 = 15/216
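A further sanity check on a function like this is that the probabilities for every possible count, from zero to n, add up to one. The sketch below assumes the standard-library math.comb (Python 3.8 or later) in place of SciPy so that it runs on its own; binomial_probability is a hypothetical stand-in for the function defined in the text.

```python
from math import comb, isclose

def binomial_probability(n, k, p):
    """Probability of exactly k events in n independent trials with event probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 3, 1 / 6.0

# One probability for each possible count of sixes: 0, 1, 2 and 3
probabilities = [binomial_probability(n, k, p) for k in range(n + 1)]
total = sum(probabilities)

print(probabilities)
print(isclose(total, 1.0))  # compare with a tolerance, since these are floats
```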

As a biological example we could investigate the distribution in the number of
sequencing errors (i.e. calling the wrong nucleotide) we expect when determining a DNA
sequence of a given length. If the sequencing machine has a random error rate of 0.01 and
reads the sequence for a total of 100 nucleotides, then the distribution of the number of
errors can be plotted as follows:

from matplotlib import pyplot

p = 0.01 # Probability of an error at each nucleotide
n = 100 # Number of nucleotides read

xVals = []
yVals = []

for k in range(7):
    pk = binomialProbability(n, k, p)
    xVals.append(k)
    yVals.append(pk)

pyplot.plot(xVals, yVals)
pyplot.show()

This (plotted in Figure 21.6) shows that although the expectation is to have one error
every 100 nucleotide positions, around 37% of the time there will be no errors.
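Both figures in this statement can be computed directly rather than read off the plot. The following is a rough sketch using only the standard library (math.comb, Python 3.8 or later, is an assumption here, as is the helper name binomial_probability); it evaluates the probability of zero errors and the mean number of errors for the same n and p.

```python
from math import comb

def binomial_probability(n, k, p):
    """Probability of exactly k events in n independent trials with event probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 100, 0.01

# Probability of reading all 100 nucleotides without a single error
prob_no_errors = binomial_probability(n, 0, p)

# Mean of the distribution: each count k weighted by its probability
expected_errors = sum(k * binomial_probability(n, k, p) for k in range(n + 1))

print(round(prob_no_errors, 3))
print(round(expected_errors, 3))
```

For k = 0 the formula collapses to (1 - p)**n, and the weighted sum should agree with the product n * p up to floating-point error.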




