Python Programming for Biology: Bioinformatics and Beyond

Tim J. Stevens and Wayne Boucher

Conservation analysis

Once we have some sequences that we are sure are related to one another, we can begin to look at how the sequences differ, despite the common connection. Such sequences are often different versions of a gene, from different organisms, which function in the same way. The basic principle of this type of analysis is that when the biological role of a particular set of DNA sequences (and thus also any protein produced) is conserved, the residues in the sequence that are important for this function are also conserved, but those that are not so important are freer to vary. DNA sequences naturally change when cells divide (and the changes are passed on to future generations when organisms reproduce) because DNA replication is not perfectly accurate. If a sequence change occurs that is detrimental to the function of the cell or organism, then the change will tend not to be passed on: the cell may die, offspring may not survive, or the descendants will not be as successful as the unaltered members of the population. Conversely, changes that are of little or no detriment will be tolerated. These could be at unimportant genetic locations (for example, the last position of a codon often makes no difference to which amino acid is produced), or they could be variations that do cause a noticeable change but which function just as well, as when one amino acid is exchanged for another that can act in the same way.
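
The remark about the last codon position can be checked directly against the standard genetic code. The following is a minimal sketch, not taken from the original text: the names `CODON_TABLE` and `synonymous_fraction` are illustrative assumptions, and a change from one stop codon to another is counted as leaving the product unchanged.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
# '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_fraction(position):
    """Fraction of single-base changes at one codon position (0, 1 or 2)
    that leave the encoded residue unchanged."""
    same = 0
    total = 0
    for codon, residue in CODON_TABLE.items():
        for base in BASES:
            if base == codon[position]:
                continue  # Not a change
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            if CODON_TABLE[mutant] == residue:
                same += 1
    return same / total


for position in range(3):
    print("Codon position", position + 1, round(synonymous_fraction(position), 3))
```

Under these assumptions the third position should show by far the largest fraction of silent changes, which is the sense in which it is "often irrelevant".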

Simply by aligning sequences and discovering positions that significantly preserve residue type we can tell that those positions are important, even if we do not yet know why they are important. Also, if we can classify sequences that we know act differently despite being similar, then the individual changes in the sequence can often explain why the sequences as a whole act differently. To take an example from the study of genetic diseases, if you compare the beta-globin gene of people who have sickle-cell anaemia with that of people who do not have the disease, it is very easy to generate a sequence alignment and see that there is a change in the DNA, and hence the protein sequence, at the seventh codon which is present only in those with the disease. Further investigation shows that this change really is the underlying cause of the disease; it causes haemoglobin to stick together aberrantly.
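
One simple way to find positions that preserve residue type is to go through an alignment column by column and record how dominant the most common residue is. The sketch below is only an illustration under stated assumptions: the function name `conservation_profile` is hypothetical, the toy alignment is made up for demonstration, the sequences are assumed to be already aligned to equal length, and gaps count against conservation.

```python
from collections import Counter


def conservation_profile(alignment, gap="-"):
    """For each alignment column, give the fraction of sequences that
    carry the most common residue. Gaps never count as a residue."""
    num_seqs = len(alignment)
    profile = []
    for column in zip(*alignment):  # Transpose rows into columns
        counts = Counter(res for res in column if res != gap)
        top = counts.most_common(1)[0][1] if counts else 0
        profile.append(top / num_seqs)
    return profile


# Made-up toy alignment, for illustration only
toy_alignment = [
    "MKVLAAG",
    "MKILSAG",
    "MRVLTAG",
    "MKVL-AG",
]

for index, score in enumerate(conservation_profile(toy_alignment), start=1):
    print(index, score)
```

Columns scoring close to 1.0 are candidates for functionally important positions. A real analysis would usually also account for how similar the sequences are overall and for substitutions between similar residues.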

When you look in detail at positions in a protein sequence and measure how well the residues are preserved, the reasons and effects are often best understood by considering the folded structure of the protein, i.e. the three-dimensional locations of the atoms. Amino acid residues that are involved in a specific chemical reaction catalysed by the protein, at its active site, are usually very well preserved. Other residues, for example those in the folded core of the protein, may be well conserved because of their importance in determining the shape of the protein, although some variation will be tolerated if the amino acids are replaced by similar types that fit together in a similar way. Positions that are not so important for the shape of a protein, generally the residues on the surface and those in flexible regions, will tend to vary the most. However, even in such locations there are some constraints on which amino acids are tolerated for normal function; for example, a change could make a necessary flexible region inflexible.
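
The idea that similar amino acids can stand in for one another can be built into a conservation measure by scoring residue classes and not exact identities. The sketch below assumes one coarse physicochemical grouping, `GROUPS`, which is only one of many possible schemes and is not taken from the original text. The function names are likewise illustrative, and each column is assumed to contain only the 20 standard residues with no gaps.

```python
from collections import Counter

# Assumed, illustrative grouping of the 20 standard amino acids
GROUPS = [
    "AVLIMC",  # Aliphatic or otherwise hydrophobic
    "FWY",     # Aromatic
    "STNQ",    # Polar, uncharged
    "KRH",     # Positively charged
    "DE",      # Negatively charged
    "GP",      # Special backbone conformations
]
RESIDUE_GROUP = {res: i for i, members in enumerate(GROUPS) for res in members}


def identity_conservation(column):
    """Fraction of the column taken up by the single most common residue."""
    return Counter(column).most_common(1)[0][1] / len(column)


def group_conservation(column):
    """Fraction of the column taken up by the most common residue group."""
    counts = Counter(RESIDUE_GROUP[res] for res in column)
    return counts.most_common(1)[0][1] / len(column)


# A made-up buried position that varies, but only among hydrophobic residues
core_column = "VILV"
print(identity_conservation(core_column), group_conservation(core_column))
```

A column such as `VILV` looks only half conserved by identity yet fully conserved by group, which is the pattern described above for positions in the folded core.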

If we step back from the scale of an individual gene or protein and look at the context of the many genes on the chromosomes that make up a whole genome, then we can observe trends that show how the genome as a whole is evolving. A good example of this is that when the human genome is compared to the chimpanzee genome it becomes apparent that human chromosome 2 has no single chimpanzee equivalent; indeed there are two chimp chromosomes that correspond to the human one. We are certain of this because the relative locations and identities of the equivalent genes are preserved, even if the lengths of the chromosomes differ. Going on from this, further analysis shows that the human chromosome was created by the merging of two smaller ones; the other great apes have two rather than one, so we are confident that two chromosomes is the ancestral situation.
