may be represented in several different ways. In the simplest form, one-letter codes are
listed sequentially, where each letter represents a different kind of amino acid. The
sequence may also be represented by three-letter amino acid codes. For both kinds of
sequence the amino acids are listed in order starting from the N-terminus, which has an
underlying chemical structure, which in most biological situations adopts a particular
The amino acids that are linked into a protein chain are often referred to as residues.
The origin of this term is somewhat archaic; it stems from the early days of biochemistry.
When the sequence of amino acids in a protein was first discovered, it was done by
carefully removing only the amino acid at the start of the protein chain using chemical
protein, and the amino acid was left as a chemical residue (i.e. the leftover) from the
cleavage reaction. Successive rounds of amino acid removal, on a shortening protein
chain, give successive chemical residues, each of which corresponds to a particular kind
of amino acid. Thus the order of the kinds of chemical residue reveals the order of the
amino acids in the protein. The term residue is still used
when one wants to refer to a particular amino acid in a particular position of a protein
chain. The term is also frequently used in the same way for the entities that make up
chains of DNA and RNA, the other types of biological molecules that have a linear
When a protein is constructed it is made inside a living cell by joining amino acids
together via peptide links, in the correct order for that type of protein, in a process called
translation. Which of the 20 types of amino acid is joined to the previous one in the
sequence, at the growing end of a protein chain, is determined by a different kind of
molecule: RNA. RNA molecules are also made up of chains of smaller entities, which in this
case are called nucleotides (these are completely different from the amino acids found in
proteins). RNA molecules in this instance can be thought of as messages,
because they are relaying the information to create proteins. The sequence
information that RNA transfers to protein ultimately comes from DNA, arguably the most
famous of the biological molecules. It should be noted that not all RNA molecules are
used to make proteins; the non-coding RNAs have various other roles in a cell.
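The way an RNA message determines the order of amino acids can be sketched in a few lines of Python. This is only an illustration: it assumes the standard genetic code, in which the message is read three nucleotides at a time (a detail not covered in the text above), and it includes just the handful of codons needed for the made-up example message. The names CODON_TABLE, ONE_LETTER and translate are our own choices.

```python
# Partial codon table taken from the standard genetic code; only the
# codons needed for the example message are included (an assumption
# made to keep the sketch short).
CODON_TABLE = {
    'AUG': 'Met', 'GCU': 'Ala', 'UGG': 'Trp',
    'AAA': 'Lys', 'GGC': 'Gly',
    'UAA': None,  # None marks a stop signal: the chain is complete
}

# One-letter equivalents of the three-letter codes used above
ONE_LETTER = {'Met': 'M', 'Ala': 'A', 'Trp': 'W', 'Lys': 'K', 'Gly': 'G'}


def translate(rna_seq):
    """Read an RNA message three nucleotides at a time, adding one
    residue to the growing chain for each triplet."""
    residues = []
    for i in range(0, len(rna_seq) - 2, 3):
        codon = rna_seq[i:i + 3]
        residue = CODON_TABLE[codon]
        if residue is None:  # stop signal reached
            break
        residues.append(residue)
    return residues


message = 'AUGGCUUGGAAAGGCUAA'  # made-up example message
protein = translate(message)
print('-'.join(protein))                        # three-letter codes
print(''.join(ONE_LETTER[r] for r in protein))  # one-letter codes
```

The two print statements show the same protein sequence written with three-letter and with one-letter codes, in both cases starting from the N-terminus.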
The sequence of components in RNA is essentially a short-lived copy of the
information that is stored in molecules of DNA. So even though the actual chemical
reactions of life mostly happen because of proteins, the blueprint of how to make the
proteins comes from the DNA. DNA is the permanent store of information present in
every cell. There is a small caveat to this point, because some cells, such as red blood cells in
human beings, lose their DNA. Losing its DNA gives the red blood cell more space to fulfil its
role of carrying oxygen around the body, at the cost of having a short lifespan: its RNA
messages will eventually run out and it will no longer be able to make new protein (which
all cells must do to survive).
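The idea that RNA is a copy of the DNA sequence can be illustrated with a minimal Python sketch. It assumes that the DNA is given as the strand which reads the same as the RNA, so that the only difference is that RNA uses the letter U wherever DNA uses T; the function name transcribe and the example sequence are made up for illustration.

```python
def transcribe(dna_seq):
    """Return the RNA copy of a DNA sequence, assuming the DNA strand
    given reads the same as the RNA (T in DNA becomes U in RNA)."""
    return dna_seq.upper().replace('T', 'U')


gene = 'ATGGCTTGGAAAGGCTAA'  # made-up example sequence
print(transcribe(gene))
```

The DNA string itself is left unchanged by the function, which loosely mirrors the point that DNA is the permanent store while the RNA copy is disposable.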
DNA
DNA is present in a cell because it was passed from parent to offspring. Half of your DNA
sequence comes from your mother and half from your father. Of the total DNA inside a
cell, only part is used to make RNA messages, and thus ultimately proteins. The
regions of DNA that are used to make RNA, by specifying its sequence, are called genes.
The remainder of the DNA that is not part of any gene may have a biological role or it
may be junk. Junk DNA does not have any specific function, but it is perhaps useful in
providing space around genes so that life can evolve by shuffling genes without damaging
them.
The parts of DNA that are neither junk nor genes are critically important. Included in
such regions are DNA sequences that determine which genes are actually used on a given
occasion. For example, consider a brain cell and a muscle cell inside a human; both cells
have the same DNA but one helps you think and the other helps you move. The different
jobs that the different cells do are only possible because they make different kinds of
protein molecules. They make different protein molecules because different sets of genes
are active. In each type of cell some genes will be switched off and some will be switched
on. It is DNA that lies outside a gene that provides these on/off switches (often near the
starts of genes). We will forgo detailed discussion of how these gene switches are
controlled, but suffice it to say that in the case of muscle cells and brain cells in humans
the initial difference in gene activation is made early in development, when a baby is just a
tiny embryo.
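As a rough way of picturing this, the on/off state of genes in different cell types can be modelled with Python sets. The gene names below are hypothetical placeholders rather than real genes, and the model deliberately ignores how the switches themselves are controlled.

```python
# Every cell type carries the same complete set of genes.
# The names are hypothetical, for illustration only.
GENOME = {'geneA', 'geneB', 'geneC', 'geneD'}

# Which genes are switched on differs from one cell type to another
ACTIVE_GENES = {
    'brain': {'geneA', 'geneB'},
    'muscle': {'geneA', 'geneC'},
}


def genes_switched_off(cell_type):
    """Genes present in the DNA but not active in this cell type."""
    return GENOME - ACTIVE_GENES[cell_type]


for cell_type in sorted(ACTIVE_GENES):
    switched_on = sorted(ACTIVE_GENES[cell_type])
    switched_off = sorted(genes_switched_off(cell_type))
    print(cell_type, 'on:', switched_on, 'off:', switched_off)
```

Both cell types share the same GENOME set; only the subset that is switched on differs, which is what gives each cell its own collection of proteins.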