Python Programming for Biology: Bioinformatics and Beyond

Figure 21.5. Combining probabilistic events

[Tim J. Stevens, Wayne Boucher] Python Programming

Figure 21.5. Combining probabilistic events. The first event, that a nucleotide pair (one of the 16 possible pairs) contains an A, and the second event, that the two nucleotides are different, are subsets of the total set of outcomes. The intersection between the two events is the set of outcomes common to both. Probabilities are calculated for the events assuming that all outcomes are equally likely.

Something that follows from the basic axioms of probability is that we can use the probability of the intersection between events, $\Pr(E_1 \text{ and } E_2)$, to calculate the probability of the union between events, $\Pr(E_1 \text{ or } E_2)$:

$$\Pr(E_1 \text{ or } E_2) = \Pr(E_1) + \Pr(E_2) - \Pr(E_1 \text{ and } E_2)$$

If there is an intersection between the event $E_1$ and the event $E_2$, adding the probabilities for the two will count the overlapping outcomes twice, so subtracting the probability of the intersection, that both $E_1$ and $E_2$ happen, redresses this. This way each outcome that involves $E_1$ or $E_2$ contributes exactly once. When considering mutually exclusive events the probability $\Pr(E_1 \text{ and } E_2)$ is naturally zero, in which case $\Pr(E_1 \text{ or } E_2)$ is just the sum of the individual probabilities.

We can show the calculation of $\Pr(E_1 \text{ or } E_2)$ in Python either by creating the appropriate set or by using the above equation:

union = event1 | event2  # Set with the outcomes found in either event
pUnion = sum(probs[xy] for xy in union)
print(pUnion)  # 0.81049
print(pEvent1 + pEvent2 - pEvent1and2)  # 0.81049 - same
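The lines above rely on event1, event2, probs and the pEvent values defined earlier in the chapter. As a self-contained sketch of the same calculation, the following assumes that all four nucleotides are equally likely at both positions (matching the equal-likelihood assumption of Figure 21.5) and reads the first event as "the pair contains at least one A". Both of these are assumptions made for illustration, so the printed value will not be the 0.81049 shown above, which depends on the probs values defined earlier.

```python
from itertools import product

nucleotides = "ACGT"

# Assumption: each nucleotide has probability 0.25 at both positions,
# so every one of the 16 ordered pairs has probability 1/16.
probs = {x + y: 0.25 * 0.25 for x, y in product(nucleotides, repeat=2)}

# Assumption: event 1 is read as "the pair contains at least one A".
event1 = {xy for xy in probs if "A" in xy}
# Event 2: the two nucleotides in the pair are different.
event2 = {xy for xy in probs if xy[0] != xy[1]}

def probOf(event):
    """Sum the probabilities of all outcomes in an event (a set of pairs)."""
    return sum(probs[xy] for xy in event)

pEvent1 = probOf(event1)
pEvent2 = probOf(event2)
pEvent1and2 = probOf(event1 & event2)  # Intersection: outcomes in both
pUnion = probOf(event1 | event2)       # Union: outcomes in either

print(pUnion)                           # Direct sum over the union set
print(pEvent1 + pEvent2 - pEvent1and2)  # Same value via the equation
```

Under these assumptions 7 pairs contain an A, 12 pairs have different nucleotides and 6 pairs satisfy both, so the union holds 7 + 12 - 6 = 13 of the 16 pairs and both lines print 0.8125.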

While we can treat combined dice rolls or DNA positions as discrete outcomes, we can also imagine these as arising from a chain of probabilistic selections. In the above examples the trials are independent: the result of the first has no influence on the second, which is reasonable for a fair die. However, for DNA (and many other analogous situations in biology) the probabilities of a nucleotide occurring at each position may not only be different, as discussed before; the probability for the second position may also vary according to which base is present in the first position, or indeed at many other positions.

In this case we would say the positions are not independent, and the probability of observing the second nucleotide differs depending on the outcome of the first. To calculate the probability of getting each pair of nucleotides we take the probability of obtaining the first nucleotide and multiply this by the probability of getting the second, given the first. This is what is termed a conditional probability, and in general we would need to know the probabilities for the four nucleotides given each particular preceding nucleotide.
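To make the multiplication concrete, here is a minimal sketch that builds the probability of every nucleotide pair from a first-position distribution and a table of conditional probabilities. The names pFirst, pSecondGivenFirst and pairProbs, and all of the numbers, are hypothetical and chosen only for illustration; they are not taken from real sequence data.

```python
# Hypothetical probabilities for the nucleotide at the first position.
pFirst = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}

# Hypothetical conditional probabilities: pSecondGivenFirst[x][y] is the
# probability that the second nucleotide is y, given that the first is x.
# Each inner dictionary sums to 1.
pSecondGivenFirst = {
    "A": {"A": 0.4, "C": 0.2, "G": 0.2, "T": 0.2},
    "C": {"A": 0.1, "C": 0.3, "G": 0.5, "T": 0.1},
    "G": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "T": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
}

# Probability of each pair: Pr(first) multiplied by Pr(second given first).
pairProbs = {}
for x, pX in pFirst.items():
    for y, pYgivenX in pSecondGivenFirst[x].items():
        pairProbs[x + y] = pX * pYgivenX

print(pairProbs["CG"])          # Pr(C first) * Pr(G second | C first)
print(sum(pairProbs.values()))  # All 16 pairs together should sum to ~1
```

If every row of the conditional table were identical, the second position would not depend on the first and the calculation would reduce to the independent case described earlier.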
