Once we have some sequences that we are sure are related to one another, we can begin to
look at how the sequences differ, despite the common connection. Such sequences are
often versions of the same gene, taken from different organisms, which function in the
same way. The basic principle of this type of analysis is that when the biological role of a
particular set of DNA sequences (and thus also any protein produced) is conserved, the
residues in the sequence that are important for this function are also conserved, but those
that are not so important are more free to vary. DNA sequences naturally change when
cells divide (and the changes are passed on to future generations when organisms
reproduce) because DNA replication, although highly accurate, is not perfect. If a sequence change
occurs that is detrimental to the function of the cell or organism, then the change will tend
not to be passed on; the cell may die, offspring may not survive or the descendants will
not be as successful as those within the population that are unaltered. Conversely, changes
that are of little or no detriment will be tolerated. These could be at positions where the
change has no effect; for example, the last position of a codon is often irrelevant for
determining which amino acid is produced. Alternatively, they could be variations that do
cause a noticeable change but which function just as well, such as when one amino acid is
replaced by another that can act in the same way.
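The point about the last codon position can be illustrated with a minimal sketch. The table below is only a small subset of the standard genetic code, and the function name `is_synonymous` is a hypothetical helper introduced here for illustration.

```python
# A small subset of the standard genetic code; just the codons needed here.
CODON_TABLE = {
    'GGT': 'Gly', 'GGC': 'Gly', 'GGA': 'Gly', 'GGG': 'Gly',
    'GTT': 'Val', 'GTC': 'Val', 'GTA': 'Val', 'GTG': 'Val',
    'GAT': 'Asp', 'GAC': 'Asp', 'GAA': 'Glu', 'GAG': 'Glu',
}

def is_synonymous(codon_a, codon_b):
    """Return True if both codons encode the same amino acid."""
    return CODON_TABLE[codon_a] == CODON_TABLE[codon_b]

# Third-position change that leaves the amino acid (Gly) unaltered
print(is_synonymous('GGT', 'GGA'))

# Third-position change that swaps Asp for the chemically similar Glu
print(is_synonymous('GAT', 'GAA'))

# Second-position change that swaps Glu for the quite different Val
print(is_synonymous('GAG', 'GTG'))
```

The first comparison shows a change with no effect on the protein. The second shows that the last position is not always irrelevant, although here the replacement amino acid is of a similar type.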
Simply by aligning sequences and discovering positions where the residue type is strongly
preserved we can infer that those positions are likely to be important, even if we do not yet
know why they are important. Also, if we can classify sequences that we know act differently
despite being similar, then the individual changes in the sequence can often explain why
the sequences as a whole act differently. To take an example from the study of genetic
diseases, if you look at the beta-globin gene in people who have sickle-cell anaemia and
compare it to those who do not have the disease, it is very easy to generate a sequence
alignment to see that there is a change in the DNA, and hence protein sequence, of the
seventh codon which is only present in those with the disease.
Further investigation
shows that this change really is the underlying cause of the disease; it causes haemoglobin
to stick together aberrantly.
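As a rough sketch of this kind of comparison, the example below walks along two short coding fragments codon by codon and reports where they differ. The two fragments are illustrative stand-ins for the start of the normal and sickle-cell beta-globin coding sequences and are assumptions here; they should be checked against a database record before being relied upon. The function name `differing_codons` is hypothetical, and the code assumes the sequences are already aligned, in frame and free of gaps.

```python
# Illustrative fragments only (assumed, not taken from a database record):
# the first ten codons of a normal and a sickle-cell beta-globin coding sequence.
normal = 'ATGGTGCATCTGACTCCTGAGGAGAAGTCT'
sickle = 'ATGGTGCATCTGACTCCTGTGGAGAAGTCT'

def differing_codons(seq_a, seq_b):
    """List (codon number, codon_a, codon_b) for every codon that differs.

    Assumes two equal-length, gap-free, in-frame sequences.
    Codon numbering starts at 1 and includes the start codon.
    """
    differences = []
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i+3]
        codon_b = seq_b[i:i+3]
        if codon_a != codon_b:
            differences.append((i // 3 + 1, codon_a, codon_b))
    return differences

print(differing_codons(normal, sickle))  # [(7, 'GAG', 'GTG')]
```

With real data the sequences would first need a proper alignment, because insertions or deletions would shift the codon positions.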
When you look in detail at positions in a protein sequence and measure how well the
residues are preserved, then the reasons and effects are often best understood by
considering the folded structure of the protein; i.e. by considering the three-dimensional
locations of the atoms. Amino acid residues that are involved in a specific chemical
reaction that is catalysed by the protein, at its active site, are usually very well preserved.
Other residues, for example, in the folded core of the protein, may be well conserved
because of their importance in determining the shape of the protein, although some
variation will be tolerated in the amino acids if they are replaced by similar types that fit
together in a similar way. Positions that are not so important for the shape of a protein,
generally the residues on the surface and those in flexible regions, will tend to vary the
most. However, even in such locations there are some constraints on which amino acids
are tolerated for normal function; for example, a change could make a necessary flexible
region inflexible.
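One simple way to measure how well each position is preserved is to take, for every alignment column, the fraction of sequences that share the most common residue. The sketch below does this for a tiny made-up protein alignment; the sequences and the function name `column_conservation` are hypothetical and serve only as an illustration.

```python
from collections import Counter

def column_conservation(alignment):
    """Return, for each column, the fraction of sequences that carry
    the most common residue at that position.

    Assumes all sequences have the same length. A gap character,
    if present, is counted like any other symbol.
    """
    scores = []
    for column in zip(*alignment):
        counts = Counter(column)
        top_count = counts.most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores

# Hypothetical aligned protein fragments, one string per sequence
alignment = ['MKVLA',
             'MKILA',
             'MRVLS',
             'MKVLT']

print(column_conservation(alignment))  # [1.0, 0.75, 0.75, 1.0, 0.5]
```

This measure treats every substitution as equally severe. A fuller treatment would score replacements by similar amino acid types, such as valine for isoleucine in the third column, as less disruptive than replacements by dissimilar ones.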
If we step backwards from the scale of an individual gene or protein and look at the
context of lots of genes on the chromosomes which make up a whole genome, then we can
observe trends that show how the genome as a whole is evolving. A good example of this
is that when the human genome is compared to the chimpanzee genome it becomes
apparent that the human chromosome 2 has no single chimpanzee equivalent; indeed
there are two chimpanzee chromosomes that correspond to the human one. We are confident of
this because the relative locations and identities of equivalent genes are preserved,
even if the lengths of the chromosomes differ. Going on from this, further analysis shows that
the human chromosome has been created from the merging of two smaller ones; the other
great apes have two rather than one, so we can be confident that two chromosomes is the
ancestral situation.
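The reasoning about preserved gene order can be sketched with a toy example. Everything here is hypothetical: the gene labels, the chromosome labels 'A' and 'B' and the function name `chromosome_blocks` are invented for illustration and do not represent real genomic data.

```python
from itertools import groupby

# Hypothetical gene labels, in their order along one human chromosome
human_order = ['g1', 'g2', 'g3', 'g4', 'g5', 'g6']

# Hypothetical lookup: which chimpanzee chromosome carries the equivalent gene
chimp_location = {'g1': 'A', 'g2': 'A', 'g3': 'A',
                  'g4': 'B', 'g5': 'B', 'g6': 'B'}

def chromosome_blocks(gene_order, location):
    """Collapse runs of consecutive genes that map to the same partner
    chromosome into (chromosome label, number of genes) blocks."""
    labels = [location[gene] for gene in gene_order]
    return [(label, len(list(group))) for label, group in groupby(labels)]

blocks = chromosome_blocks(human_order, chimp_location)
print(blocks)  # [('A', 3), ('B', 3)]
```

Two long, unbroken blocks are the pattern expected if one chromosome corresponds to two chromosomes joined end to end. Many short, interleaved blocks would instead point to a more complicated history of rearrangement.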