Python Programming for Biology: Bioinformatics and Beyond

Figure 12.4. A simple feed-forward network to illustrate the principles of dynamic

Download 7,75 Mb.

Pdf ko'rish

bet	172/514
Sana	30.12.2021
Hajmi	7,75 Mb.
	#91066

1 ... 168 169 170 171 172 173 174 175 ... 514

Bog'liq
[Tim J. Stevens, Wayne Boucher] Python Programming

Figure 12.4. A simple feed-forward network to illustrate the principles of dynamic

programming. The objective of the dynamic programming algorithm is to find the best

route through the network without having to consider all possible routes. By finding the

best route to each intermediate location the algorithm excludes many route possibilities.

As illustrated, starting at A you must go via B or C then through D or E, then F or G

before finally getting to H. Each leg of the journey takes a different time and all of these

times are added to get the total travel time. If I want to find the quickest journey from A to

H then I have to potentially consider all of the possible routes, A, B, D, F, H, A, C, E, F, H

etc. … It might seem that you have to calculate the time for all possible journeys before

you know which way is best. However, this problem can be broken into smaller sub-

problems which allow us to discard certain routes at an early stage. For example, we can

calculate the shortest journey time from A to D and A to E; potentially these could go

through either B or C, but if both of the shortest routes go through B we need never

consider point C ever again. When extending the routes through F and G we only have to

extend our best routes from D and E, which we have already determined only use point B.

In general, if we add up journey times as we go, and given that at each point there are only

two previous directions that we could have come from, we can choose the best direction to

get to that point and ignore all routes that come from the other direction. Thus in the end

when we reach H there are only two routes to choose between, one from F and one from

G. At the start we couldn’t know which of F or G would be optimal in the end, but in

calculating the routes (sub-problems) to these final points we have only been following the

best paths for each stage, which is much more efficient than following them all. Sequence

alignment is analogous to this kind of journey analysis, the difference being that we want

the route of maximum alignment score (not minimum time) and we have three routes into

each point rather than two; the three routes are (i) to put a gap in one sequence, (ii) put a

gap in the other sequence or (iii) align two residues.

If we start from the beginning of two sequences and consider the alignment growing by

inserting a gap in one sequence or the other, or by putting the next two residues together,

at each point the number of possibilities grows by a multiple of three each time. However,

to get to a particular residue pairing there are three routes that it could have come from;

and like in the above example only the best one to this point needs to be continued.

Specifically, what sequence alignment does is to extend the (sub-total) scores of these

three routes with gap penalties or a similarity score for the next residue pair. By taking the

maximum of these scores the other two are discarded and their routes cut; two of the sub-

alignments that get to this point do not need to be extended any further, because there is

already a better alignment to get to the same place. Every time the alignment is extended,

multiple new possibilities are generated, but at the same time previous sub-alignments can

be disregarded. By repeatedly extending the alignment, and pruning sub-optimal

alternatives along the way, the end of the sequences is eventually reached. Initially as the

sequences are compared, there is no way to know which sub-alignments are discarded and

which win out, but the decisions that are taken (whether to add gaps or pair up residues) at

each step along the way can be remembered. So, starting from the end point the winning

decisions can be followed, backwards, to find what gave the best score at each point and

hence the best alignment.

Download 7,75 Mb.

Do'stlaringiz bilan baham:

1 ... 168 169 170 171 172 173 174 175 ... 514