Python Programming for Biology: Bioinformatics and Beyond



Download 7,75 Mb.
Pdf ko'rish
bet183/514
Sana30.12.2021
Hajmi7,75 Mb.
#91066
1   ...   179   180   181   182   183   184   185   186   ...   514
Bog'liq
[Tim J. Stevens, Wayne Boucher] Python Programming

Profile alignments

Once we have calculated a profile for an alignment we can align it with something else,

just  as  we  do  in  a  pairwise  sequence  alignment.  We  actually  require  the  ability  to  align

pairs  of  profiles  for  the  demonstration  function  to  create  multiple  alignments.  When  we

create  a  multiple-sequence  alignment  by  progressively  adding  sequences,  we  have  to

combine  profiles  for  one  alignment  (one  branch  of  the  tree)  with  another  alignment  (a

different branch); we cannot do a plain sequence alignment.

We  won’t  go  through  the  profile  alignment  function  in  great  detail  because  it  is  so

similar to the regular sequence alignment function discussed in

Chapter 12

.  Nonetheless,

there  are  still  a  few  key  differences.  Firstly  and  most  obviously,  we  are  not  passing-in

sequences  (strings  of  letters),  but  rather  profiles  (lists  of  residue  fraction  dictionaries).

Secondly,  when  we  calculate  the  residue  similarity  scores,  rather  than  comparing  one

residue with another, we compare all of the residue fractions with each other. Accordingly,

say that at a given position we have 50% ‘A’ and 50% ‘G’ in one profile and 25% ‘C’ and

75% ‘G’ in the other profile, then the final similarity score comes from the addition of four

values; the combinations A:C, A:G, G:C and G:G each using a different similarity value

from the substitution matrix, multiplied by the weight for that pair. You can imagine the

weights  for  each  combination  being  the  area  within  a  square  defined  by  edges  that  have

lengths  defined  by  the  individual  profile  weights.  For  example,  the  A:C  combination  is

50%  times  25%,  giving  12.5%  to  the  score.  Thirdly,  the  function  also  modifies  the  gap

penalties  because  the  input  profiles  are  generated  from  alignments  that  may  carry  gaps.

The  modification  is  to  multiply  the  penalties  by  the  total  weight  for  the  profiles  at  each

point;  any  missing  weight  will  be  due  to  gaps  so  the  total  is  not  necessarily  100%.  This

multiplication  reduces  the  gap  penalties,  thus  gaps  are  penalised  less  when  gaps  were

present in the input profiles.

def profileAlign(profileA, profileB, simMatrix, insert=8, extend=4):

numI = len(profileA) + 1

numJ = len(profileB) + 1

scoreMatrix = [[0] * numJ for x in range(numI)]

routeMatrix = [[0] * numJ for x in range(numI)]

for i in range(1, numI):

routeMatrix[i][0] = 1

for j in range(1, numJ):

routeMatrix[0][j] = 2

for i in range(1, numI):

for j in range(1, numJ):

penalty1 = insert

penalty2 = insert

if routeMatrix[i-1][j] == 1:

penalty1 = extend




elif routeMatrix[i][j-1] == 2:

penalty2 = extend

fractionsA = profileA[i-1]

fractionsB = profileB[j-1]

similarity = 0.0

totalWeight = 0.0

for residueA in fractionsA:

for residueB in fractionsB:

weight = fractionsA[residueA] * fractionsB[residueB]

totalWeight += weight

similarity += weight * simMatrix[residueA][residueB]

penalty1 *= totalWeight

penalty2 *= totalWeight

paths = [scoreMatrix[i-1][j-1] + similarity, # Route 0

scoreMatrix[i-1][j] - penalty1, # Route 1

scoreMatrix[i][j-1] - penalty2] # Route 2

best = max(paths)

route = paths.index(best)

scoreMatrix[i][j] = best

routeMatrix[i][j] = route

profileOutA = []

profileOutB = []

i = numI-1

j = numJ-1

score = scoreMatrix[i][j]

while i > 0 or j > 0:

route = routeMatrix[i][j]

if route == 0: # Diagonal

profileOutA.append(profileA[i-1])

profileOutB.append(profileB[j-1])

i -= 1

j -= 1


elif route == 1: # Gap in profile B

profileOutA.append(profileA[i-1])

profileOutB.append(None)

i -= 1


elif route == 2: # Gap in profile A

profileOutA.append(None)

profileOutB.append(profileB[j-1])

j -= 1


profileOutA.reverse()

profileOutB.reverse()




return score, profileOutA, profileOutB

alignA = ['SRPAPVV--LII', 'TVPAPVVIILII']

alignB = ['HHFSEPEITLIIF', 'H-FSELVIALIIF']

print(profileAlign(profile(alignA), profile(alignB), BLOSUM62))

The  output  of  the  function  is  a  numeric  score  and  two  aligned,  gapped  profiles.  Note

that the output profiles were made by using the .append() function to add to the end of the

lists, even though  we  went  through  the  sequence  positions  in  reverse  order  (decreasing  i

and j). This results in the profile lists being created in the reverse order to what we require,

hence reverse() is used to flip the order at the end.

3

A gap is indicated in an output profile



by using the None object; this is arbitrary but something that evaluates as false is handy.


Download 7,75 Mb.

Do'stlaringiz bilan baham:
1   ...   179   180   181   182   183   184   185   186   ...   514




Ma'lumotlar bazasi mualliflik huquqi bilan himoyalangan ©hozir.org 2024
ma'muriyatiga murojaat qiling

kiriting | ro'yxatdan o'tish
    Bosh sahifa
юртда тантана
Боғда битган
Бугун юртда
Эшитганлар жилманглар
Эшитмадим деманглар
битган бодомлар
Yangiariq tumani
qitish marakazi
Raqamli texnologiyalar
ilishida muhokamadan
tasdiqqa tavsiya
tavsiya etilgan
iqtisodiyot kafedrasi
steiermarkischen landesregierung
asarlaringizni yuboring
o'zingizning asarlaringizni
Iltimos faqat
faqat o'zingizning
steierm rkischen
landesregierung fachabteilung
rkischen landesregierung
hamshira loyihasi
loyihasi mavsum
faolyatining oqibatlari
asosiy adabiyotlar
fakulteti ahborot
ahborot havfsizligi
havfsizligi kafedrasi
fanidan bo’yicha
fakulteti iqtisodiyot
boshqaruv fakulteti
chiqarishda boshqaruv
ishlab chiqarishda
iqtisodiyot fakultet
multiservis tarmoqlari
fanidan asosiy
Uzbek fanidan
mavzulari potok
asosidagi multiservis
'aliyyil a'ziym
billahil 'aliyyil
illaa billahil
quvvata illaa
falah' deganida
Kompyuter savodxonligi
bo’yicha mustaqil
'alal falah'
Hayya 'alal
'alas soloh
Hayya 'alas
mavsum boyicha


yuklab olish