Python Programming for Biology: Bioinformatics and Beyond


Restriction enzyme example



[Tim J. Stevens, Wayne Boucher] Python Programming


Taking the DNA example a little further, let’s consider a restriction enzyme called HindIII that is commonly used in molecular biology and which cuts DNA at the specific sequence AAGCTT. Using a simple probabilistic model about the likelihood of finding a given letter at a given position in an otherwise random DNA sequence, we can estimate various properties, like how often the enzyme cuts or what the size of the fragments will be after cutting. A DNA sequence actually isn’t totally random, but the approximation is nonetheless good enough to get useful predictions.

Assuming that the nucleotide at one position does not depend in any way on what the nucleotides are at the other positions (i.e. the nucleotides at different positions are independent), we can calculate the probability of a HindIII site at any six-residue subsequence to be Pr(A) × Pr(A) × Pr(G) × Pr(C) × Pr(T) × Pr(T). This works out to one cut in 4096 (4^6) positions on average, if we assume equal probabilities of 1/4 for each nucleotide. Hence for a DNA sequence of length N we would expect N × 1/4096 restriction enzyme cut sites. Also, on average we could expect the separation to be about 4096 bases. Calculating the probability of the cut site using the non-equal nucleotide probabilities calculated above we get:

cutSite = 'AAGCTT'

# letterProbs holds the nucleotide probabilities calculated above
probSite = 1.0 # Starting value
for letter in cutSite:
    probSite *= letterProbs[letter]

print(probSite) # 0.00023637 – approx one in 4230
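To compare the equal-probability estimate with a non-equal one in a single runnable piece, the following is a minimal sketch. The letterProbs values here are hypothetical placeholders chosen only for illustration (they are not the values calculated earlier in the chapter), and the function name siteProbability and the sequence length are likewise assumptions.

```python
# Hypothetical nucleotide probabilities, for illustration only;
# the chapter's own letterProbs values are calculated elsewhere.
letterProbs = {'A': 0.3, 'C': 0.2, 'G': 0.2, 'T': 0.3}

# Equal probabilities: each nucleotide has a chance of 1/4
uniformProbs = {letter: 0.25 for letter in 'ACGT'}

def siteProbability(site, probs):
    """Probability of seeing 'site' at one position, assuming
    that the letters at different positions are independent."""
    prob = 1.0
    for letter in site:
        prob *= probs[letter]
    return prob

cutSite = 'AAGCTT'
probUniform = siteProbability(cutSite, uniformProbs)  # (1/4)**6
probSkewed = siteProbability(cutSite, letterProbs)

# Expected number of sites, using the N x probability approximation
# from the text (edge effects at the sequence ends are ignored).
seqLength = 1000000  # hypothetical sequence length N
expectedUniform = seqLength * probUniform
expectedSkewed = seqLength * probSkewed

print(probUniform, 1.0 / probUniform)  # one site in 4096 positions
print(probSkewed, 1.0 / probSkewed)
print(expectedUniform, expectedSkewed)
```

With these placeholder values the A- and T-rich composition makes AAGCTT somewhat more likely than under equal probabilities, so more cut sites would be expected in the same length of sequence.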

Because the occurrence of a site is effectively random we will expect a distribution of different values for the number of cut sites in a given length and also for the lengths of the fragments. In other words, because the sites are random, and not regular, the spacing between sites will generally be more or less than 4230 bases. We will consider models for the shape of such probability distributions later in this chapter.
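The variation in spacing can be seen directly with a small simulation. This is only a rough sketch under stated assumptions: it uses equal nucleotide probabilities (so the average spacing should be near 4096 rather than 4230), a hypothetical sequence length of two million bases, and helper names (randomDna, findSites) that are invented here for illustration.

```python
import random

def randomDna(length, probs, rng):
    """Generate a random DNA string with independent positions,
    drawing each letter according to the given probabilities."""
    letters = sorted(probs)
    weights = [probs[letter] for letter in letters]
    return ''.join(rng.choices(letters, weights=weights, k=length))

def findSites(seq, site):
    """Return the start positions of every occurrence of 'site'."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions

rng = random.Random(1)  # fixed seed so that runs are repeatable
uniformProbs = {letter: 0.25 for letter in 'ACGT'}

seq = randomDna(2000000, uniformProbs, rng)  # hypothetical length
positions = findSites(seq, 'AAGCTT')

# Distances between the starts of successive cut sites
spacings = [b - a for a, b in zip(positions, positions[1:])]

print(len(positions))                 # number of sites found
print(sum(spacings) / len(spacings))  # mean spacing
print(min(spacings), max(spacings))   # spread of the spacings
```

The smallest and largest spacings typically differ greatly from the mean, which illustrates why a full probability distribution, rather than a single average value, is needed to describe the fragment lengths.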

