Conservation and variation
In earlier chapters we have analysed sequences to detect their similarities and thus to form
alignments. The purpose of alignment is not only to group sequences, but also to show
precisely how the sequences differ and where they are preserved. If we look down the
columns of a multiple alignment, some positions will use more residue types than others,
and for protein sequences we can see places where the chemical character of the amino
acids stays the same even though the precise residues differ. Accordingly, we can measure
how conserved or variable a given
position is. Combining many positions we can say how variable a whole region or whole
gene is. We can analyse sequence variability and find changes of biological or medical
importance, and also learn something about the evolution and origins of the sequences.
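
As a minimal sketch of how the conservation of each alignment position might be measured, the Python function below scores every column by the fraction of sequences that share its most common residue. The function name column_conservation and the toy alignment are illustrative assumptions, not part of any library; the sketch also assumes that all sequences are already aligned to the same length, and it treats a gap character as just another residue type.

```python
from collections import Counter

def column_conservation(alignment):
    """Score each alignment column by the fraction of sequences
    that carry its most common residue (1.0 = fully conserved)."""
    num_seqs = len(alignment)
    scores = []

    # zip(*alignment) walks down the columns of the alignment
    for column in zip(*alignment):
        counts = Counter(column)
        top_count = counts.most_common(1)[0][1]
        scores.append(top_count / num_seqs)

    return scores

# Hypothetical toy alignment of four equal-length sequences
alignment = ["ACGT",
             "ACGA",
             "ACCA",
             "AGCA"]

print(column_conservation(alignment))
```

Averaging these scores over a run of columns would give a crude measure of how variable a whole region is. For proteins, residues could first be mapped to chemical classes so that substitutions preserving chemical character still count as conserved.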
As discussed, when organisms reproduce, sequence changes naturally occur. However,
not all changes in the DNA are of consequence and many have no effect at all on an
organism. This is because not all DNA has an immediate biological role, and even within
the regions that do there can often be several sequences that perform a job equally well.
Generally, changes in DNA that do not affect biological function are observed more
frequently: selection does not act against them, so there is no reason for them not to be
passed on.
In the genomes of many organisms there is a high proportion of non-functional, often
repetitive 'junk' DNA between genes. This is not to say that all DNA between genes is
useless: intergenic regions contain control elements that regulate gene expression, such
as promoters and enhancers, and also structural DNA that maintains chromosomes, such as
telomeres, which protect chromosome ends, and centromeres, which allow chromosomes to be
segregated at cell division. Some
regions of non-functional DNA tend to show the highest rate of change during evolution
and thus the largest variation between individuals of a species. Human DNA
fingerprinting, for example, which may be used to identify criminals or detect family
members, works by looking at hypervariable regions that are different in almost every
person. Such fingerprinting would not work nearly so well if gene-coding regions were
used; there would be far fewer differences and finding two individuals with the same
sequence (i.e. not the real criminal) would be much more likely, and in some cases
positively expected.
The task of some analyses, rather than just to detect variations, is to measure the
relative rate of change. If we can find sites where the rate of change of the sequence is
above or below the normal expected value, then this tells us something about the process
of evolution at a fine scale. A common rate measure for the protein-coding regions of
genes compares the number of DNA substitutions that change the amino acid sequence with
the number that do not: the synonymous, or silent, substitutions. Remember that the
number of three-base codons (64) is larger than the number of amino acids (20), so there
are usually several different ways of coding for the same amino acid.
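
The comparison of silent and amino-acid-changing differences can be sketched in Python as follows. This is only a rough illustration under stated assumptions: the two coding sequences are upper-case DNA, already aligned in frame, of equal length and free of gaps or ambiguity codes; the standard genetic code is used; and the name count_codon_changes is hypothetical. It counts differing codons rather than individual base substitutions, and it applies none of the normalisation for the number of available synonymous sites that a proper rate estimate would need.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def count_codon_changes(seq_a, seq_b):
    """Return (silent, changing): the number of differing codons that
    keep the same amino acid and the number that alter it."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("Sequences must be equal length, in whole codons")

    silent = 0
    changing = 0

    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i+3]
        codon_b = seq_b[i:i+3]

        if codon_a == codon_b:
            continue  # identical codons carry no information here

        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            silent += 1    # synonymous: same amino acid
        else:
            changing += 1  # non-synonymous: amino acid differs

    return silent, changing

# Hypothetical example: Met-Ala-Lys-Gly versus Met-Ala-Arg-Gly
print(count_codon_changes("ATGGCTAAAGGT", "ATGGCCAGAGGA"))
```

Here GCT to GCC and GGT to GGA are silent, while AAA to AGA swaps lysine for arginine, so the example has two silent differences and one that changes the protein.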
In regions that have proportionately more silent changes than active
(amino-acid-changing) ones, the pattern of changes accepted during evolution indicates
purifying selection: the sequence is important and there is a reason why the protein
sequence is preserved. Where there are proportionately more active changes than silent
ones, compared to the average, this can indicate a region under positive selection. In
such regions the rate of evolution is greater than normal, and continuous change and
adaptation is advantageous. Regions of positive selection in the human genome include
genes involved
in the immune system, which are ever changing to cope with the continuous appearance of
new harmful bacteria, viruses and parasites.