Python Programming for Biology: Bioinformatics and Beyond


Phylogenetic trees using neighbour-joining



Date: 30.12.2021
[Tim J. Stevens, Wayne Boucher] Python Programming


Tree-building can be a very difficult task, largely because the number of possible branch combinations increases very rapidly with the number of input taxa. The number of combinations can be so vast that we cannot routinely test them all to find the optimum tree arrangement. Hence what is often done, and what we will do here, is to generate a reasonably good, but not necessarily optimal, tree using a computationally fast method. The method used here is the one introduced by Saitou and Nei: the neighbour-joining method. This is a quick way of generating a tree that is usually not far from a global optimum and which is also relatively easy to understand mechanistically. The generated tree will not necessarily be at the global optimum because the algorithm is ‘greedy’, in that it only optimises the local solution for pairing up sequences; it never considers the tree as a whole. Nonetheless it is still a speedy and useful example to be used in subsequent analyses, and it can also provide a starting point for more globally aware optimising methods if more accuracy is required. As with most of the examples in this book, the aim of the tree-generating example is not to give a high-performance, gold-standard program, but rather to lead you through a relatively simple piece of code that can be readily understood and which illustrates the major points of the topic.
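To give a feel for how quickly the number of possible trees grows, here is a small sketch that counts the distinct unrooted, fully bifurcating trees for a given number of labelled taxa, using the standard result that this number is the product 3 × 5 × ... × (2n − 5). The function name is our own choice for illustration and is not used elsewhere.

```python
def count_unrooted_trees(num_taxa):
    """Count distinct unrooted, fully bifurcating trees for labelled taxa."""
    if num_taxa < 3:
        return 1  # With fewer than three taxa there is only one arrangement

    count = 1
    # Multiply the odd numbers 3, 5, ..., (2n - 5)
    for k in range(3, 2 * num_taxa - 4, 2):
        count *= k

    return count


for n in (4, 5, 10, 20):
    print(n, count_unrooted_trees(n))
```

Even at 10 taxa there are already over two million candidate trees, which is why an exhaustive search quickly becomes impractical and a fast heuristic is attractive.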

The neighbour-joining algorithm involves repeatedly finding the closest pair from amongst the input sequences (and sub-trees after the first cycle) and joining these to form a new, larger sub-tree, until all sequences have been considered and only one fully joined tree remains. Thus the first step for the tree-building example is to calculate what is termed a distance matrix, where the rows and columns of the matrix represent the different input sequences and the matrix elements represent the evolutionary distance between those sequences. Here we will only compare isolated input sequences; however, the method could be adapted to consider multiple sequences for each taxon (e.g. each input is a group of sequences representing a different species) so that the evolutionary distance between organisms is better represented as a whole. The tree that results in the end is what is called an unrooted tree. This means that the tree gives no indication of which taxon came first, evolutionarily speaking, and hence of where the common (but unseen) ancestor fits in. By making an unrooted tree we are only showing the branch relationships between the input taxa. We recommend further reading if you wish to build rooted trees, i.e. those with an ancestral starting point.
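The joining cycle described above can be sketched as follows. This is a minimal illustration and not a definitive implementation: it assumes a symmetric distance matrix is already available as a list of lists, it uses the standard Saitou and Nei selection criterion (often written Q) to decide which pair counts as ‘closest’, and it returns only the unrooted branching order as nested tuples, leaving out branch lengths. The function name and the example matrix are our own choices for illustration.

```python
def neighbour_join(dist_matrix, labels):
    """Return an unrooted tree topology as nested tuples (no branch lengths)."""
    # Work on copies so that the caller's data is left untouched
    dists = [list(row) for row in dist_matrix]
    nodes = list(labels)

    while len(nodes) > 3:
        n = len(nodes)
        row_sums = [sum(row) for row in dists]

        # Find the pair (a, b) that minimises the Q criterion
        best = None
        for i in range(n - 1):
            for j in range(i + 1, n):
                q = (n - 2) * dists[i][j] - row_sums[i] - row_sums[j]
                if best is None or q < best[0]:
                    best = (q, i, j)
        _, a, b = best

        # Indices of the nodes that are not being joined this cycle
        keep = [k for k in range(n) if k not in (a, b)]

        # Distances from the new internal node to each remaining node
        new_row = [(dists[a][k] + dists[b][k] - dists[a][b]) / 2.0
                   for k in keep]

        # Drop rows and columns a and b, then append the joined node
        dists = [[dists[r][c] for c in keep] for r in keep]
        for row, d in zip(dists, new_row):
            row.append(d)
        dists.append(new_row + [0.0])
        nodes = [nodes[k] for k in keep] + [(nodes[a], nodes[b])]

    return tuple(nodes)


# Small made-up distance matrix for five taxa
labels = ['a', 'b', 'c', 'd', 'e']
example_dists = [[0, 5, 9, 9, 8],
                 [5, 0, 10, 10, 9],
                 [9, 10, 0, 8, 7],
                 [9, 10, 8, 0, 3],
                 [8, 9, 7, 3, 0]]

tree = neighbour_join(example_dists, labels)
print(tree)
```

The loop stops when three nodes remain because three sub-trees meeting at a single point already define an unrooted tree; no root is implied by the order of the items in the final tuple.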

The evolutionary distance that separates two sequences will, in this instance, be estimated using the scores generated by performing pairwise sequence alignment. Thus the values will depend on the similarity of paired residues as defined by a substitution matrix. With DNA sequences it is common to use a more complex evolutionary model to say how each change relates to distance, and there are many more sophisticated ways of calculating the distance between sequences, not all of which can be used with the relatively simple neighbour-joining method used here. However, by using substitution matrices we will keep things very simple for illustrative purposes and be able to use existing Python functions. For further reading on distance matrix generation, and tree generation in general, the PHYLIP software package is a good starting point.
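As a rough sketch of how alignment scores might be turned into a distance matrix, the example below scores each pair of sequences with a simple global alignment and takes the distance to be the shortfall from the larger of the two self-alignment scores. Everything here is an assumption made for illustration: the toy DNA similarity values, the linear gap penalty, the function names and the particular score-to-distance conversion are not taken from a published matrix or from any existing library.

```python
# Toy DNA similarity scores: +1 for identity, -1 otherwise (illustrative only)
DNA_SIM = {a: {b: (1 if a == b else -1) for b in 'ACGT'} for a in 'ACGT'}
GAP_PENALTY = 2  # Hypothetical linear cost per gap position


def align_score(seq_a, seq_b, sim_matrix, gap=GAP_PENALTY):
    """Best global alignment score, using a linear gap penalty."""
    # Only the previous row of the dynamic programming table is needed
    prev = [-gap * j for j in range(len(seq_b) + 1)]

    for i, res_a in enumerate(seq_a, start=1):
        curr = [-gap * i]
        for j, res_b in enumerate(seq_b, start=1):
            match = prev[j - 1] + sim_matrix[res_a][res_b]
            curr.append(max(match, prev[j] - gap, curr[j - 1] - gap))
        prev = curr

    return prev[-1]


def get_distance_matrix(seqs, sim_matrix):
    """Convert pairwise alignment scores into a symmetric distance matrix."""
    n = len(seqs)
    matrix = [[0.0] * n for _ in range(n)]

    # The score of a sequence aligned with itself is its best possible score
    self_scores = [align_score(s, s, sim_matrix) for s in seqs]

    for i in range(n - 1):
        for j in range(i + 1, n):
            score = align_score(seqs[i], seqs[j], sim_matrix)
            dist = float(max(self_scores[i], self_scores[j]) - score)
            matrix[i][j] = matrix[j][i] = dist

    return matrix


seqs = ['ACGTACGT', 'ACGTACGA', 'ACCTTCGA']
dist_matrix = get_distance_matrix(seqs, DNA_SIM)
for row in dist_matrix:
    print(row)
```

With this conversion identical sequences have a distance of zero, and the distance grows as the alignment score falls further below the self-alignment score. For protein sequences the toy dictionary would be replaced with a proper substitution matrix.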



