When we determine the amino acid sequence of a protein we gain the knowledge of which
types of residue have been linked into a polypeptide chain. Because we know the chemical
structure of the individual amino acid components, and because protein chains are formed
in a regular and predictable way, we therefore know virtually all of the atoms and covalent
bond connections that are present in the entire protein molecule. In general the only
exceptions arise where chemical groups are not static, as when hydrogen ions hop on and off
acidic residues, or if
the protein is subsequently modified by enzymes. Such post-translational modifications
include the formation of cross-links (between cysteine residues), cutting of the peptide
backbone and the addition of other moieties like sugars, fats and phosphate groups. While
modifications complicate matters, even if we are unable to do specific experiments to
determine what has happened (e.g. mass spectrometry) we can often discover what
has occurred once we determine the overall three-dimensional structure.
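Because the sequence fixes the covalent structure, we can estimate the mass expected for an unmodified chain and compare it with a measured value, for instance from mass spectrometry. The following is a minimal sketch, assuming rounded standard average residue masses (in daltons) and a hypothetical helper named estimate_mass; the shift of about 80 Da for phosphorylation corresponds to an added HPO3 group. The example sequence is made up.

```python
# Average residue masses in daltons (amino acid minus one water).
# Rounded standard averages; treat them as illustrative values.
RESIDUE_MASS = {
    'G': 57.05, 'A': 71.08, 'S': 87.08, 'P': 97.12, 'V': 99.13,
    'T': 101.10, 'C': 103.14, 'L': 113.16, 'I': 113.16, 'N': 114.10,
    'D': 115.09, 'Q': 128.13, 'K': 128.17, 'E': 129.12, 'M': 131.19,
    'H': 137.14, 'F': 147.18, 'R': 156.19, 'Y': 163.18, 'W': 186.21,
}
WATER_MASS = 18.02       # The free chain termini add one water overall
PHOSPHATE_SHIFT = 79.98  # Mass added by one phosphate group (HPO3)


def estimate_mass(sequence, n_phosphates=0):
    """Hypothetical helper: approximate average mass of a linear chain,
    optionally with a number of attached phosphate groups."""
    mass = sum(RESIDUE_MASS[residue] for residue in sequence.upper())
    return mass + WATER_MASS + n_phosphates * PHOSPHATE_SHIFT


if __name__ == '__main__':
    seq = 'MKTAYSA'  # Made-up example sequence
    plain = estimate_mass(seq)
    modified = estimate_mass(seq, n_phosphates=1)
    print('Unmodified:    %.2f Da' % plain)
    print('One phosphate: %.2f Da' % modified)
    print('Difference:    %.2f Da' % (modified - plain))
```

A measured mass that differs from the unmodified estimate by a recognisable amount is one hint that a modification is present, though this sketch ignores isotopes, disulphide links and other effects.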
We will now consider why proteins fold into their respective shapes. Protein folding is a
deep topic because the number of potential conformations for a typical polypeptide chain
is vast and the relationship between protein sequence and structure is generally not
predictable. Even where it is possible to investigate a sufficient number of hypothetical
three-dimensional arrangements, identifying the correct one (the native conformation
observed in nature) from purely theoretical considerations requires exceedingly long
computations. Fortunately, in molecular biology we
generally don’t have to make such tricky predictions because we can determine protein
structure by performing experiments and making observations. Because protein folding is
an exceedingly complex topic, most discussions about its mechanisms are well beyond the
remit of this programming book. Nevertheless we will describe some of the basic
principles, specifically what kind of forces are involved in holding a protein structure
together, because this helps us understand the features we observe in structure data.
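To get a feel for why an exhaustive search of conformations is impractical, the following rough sketch counts arrangements under a deliberately crude assumption: every residue independently adopts one of a small, fixed number of backbone states. The figure of three states per residue and the sampling rate are illustrative assumptions, not measured values, and the function names are hypothetical.

```python
def count_conformations(n_residues, states_per_residue=3):
    """Number of chain conformations if every residue independently
    takes one of a fixed number of states (a crude assumption)."""
    return states_per_residue ** n_residues


def years_to_enumerate(n_conformations, rate_per_second=1e12):
    """Years needed to visit every conformation once at an assumed
    sampling rate."""
    seconds_per_year = 60 * 60 * 24 * 365
    return n_conformations / (rate_per_second * seconds_per_year)


if __name__ == '__main__':
    for length in (10, 50, 100):
        total = count_conformations(length)
        print('%3d residues: %.2e conformations, %.2e years'
              % (length, total, years_to_enumerate(total)))
```

Even with these generous assumptions the count grows exponentially with chain length, which is why brute-force enumeration quickly becomes hopeless for chains of realistic size.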
Overall the folding of molecules can be thought of in terms of energy. The atoms of a
molecule, because they are in constant thermal motion, are able to change relative
position so that the overall conformation moves towards the lowest, most stable energy.
Generally you can think of this as the three-dimensional arrangement that forms the most
stabilising interactions between atoms. Strictly speaking a molecule will not sit static at
its energy minimum, because it moves about due to temperature (it has kinetic energy).
Accordingly, we often think of a molecule’s native state as being a set of similar
conformations that are close to the energy minimum, albeit bumbling about. It should
always be remembered that the higher the temperature the wider are a molecule’s motions
and the further it can stray.
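The effect of temperature can be illustrated with Boltzmann weighting, a standard result from statistical mechanics that is brought in here as an assumption rather than taken from the discussion above: the relative population of a conformation with energy E is proportional to exp(-E/RT). The energies below are invented for illustration and boltzmann_populations is a hypothetical helper.

```python
from math import exp

GAS_CONSTANT = 8.314e-3  # kJ per mole per kelvin


def boltzmann_populations(energies, temperature):
    """Fractional populations of states with the given energies
    (in kJ/mol) at a temperature in kelvin."""
    lowest = min(energies)
    # Energies are taken relative to the minimum for numerical safety
    weights = [exp(-(energy - lowest) / (GAS_CONSTANT * temperature))
               for energy in energies]
    total = sum(weights)
    return [weight / total for weight in weights]


if __name__ == '__main__':
    # Made-up energies: a minimum and progressively less stable states
    energies = [0.0, 2.0, 5.0, 10.0]
    for temperature in (280.0, 310.0, 370.0):
        populations = boltzmann_populations(energies, temperature)
        text = ', '.join('%.3f' % p for p in populations)
        print('%5.1f K: %s' % (temperature, text))
```

As the temperature rises the higher-energy states gain population at the expense of the minimum, which mirrors the idea that a warmer molecule strays further from its most stable conformation.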
Proteins fold into compact, globular structures because of the way amino acids interact
with one another and whether they interact (or do not interact) with water molecules, the
primary biological solvent that surrounds them. Sometimes a protein will have cysteine
residues that form covalent disulphide links (under oxidising conditions) that tie different
parts of the protein together, but most of the compactness and precision of folding is due
to weaker, non-covalent interactions, including those with water molecules. In simple
terms the residues that can form stabilising interactions with water lie on the outside and
those that cannot lie on the inside (in the core). Admittedly there are some kinds of
proteins that aren’t really dissolved in water directly, including those that are embedded in
lipid bilayers (the fatty membranes that surround cells and their internal compartments).
However, even here it is the ability of particular amino acids to interact with or avoid
water that is behind the formation of a compact structure.
The atoms around the peptide links, which form the backbone of a protein’s amino acid
chain, are capable of interacting in a stabilising way with water and amongst themselves;
the amide (N-H) and carbonyl (C=O) groups form polar hydrogen bonds. All things being
equal the interaction with water is stronger, but the other parts of the amino acids that stick
out from their backbone, the side chains, tip the balance so that the protein backbone is
mostly stabilised by the backbone atoms hydrogen bonding with each other, and not water.
The different amino acids have chemical structures that govern whether their side chain
can make a significant interaction with water. Side chains containing atomic groups that
can form relatively strong hydrogen bonds (O-H, N-H, C=O) and those that carry an
electric charge are said to be hydrophilic (water-loving), because they can make stabilising
interactions with water. Those that do not are described as hydrophobic (water-hating).
Strictly speaking there is not a set dividing line between hydrophobic and hydrophilic; it is
more a matter of degree. In an aqueous (water) environment, the hydrophobic and
hydrophilic residues segregate when a protein folds, to form a hydrophobic core and
hydrophilic exterior, i.e. a globule. This is just a general trend though; the protein globule
is stabilised further by the hydrogen bonds along the backbone, which tend to form regular
patterns of hydrogen-bonding networks, called secondary structure. Also, the electric
charges and polarities will push and pull the structure into the final shape. This final
conformation is one where the core residues (mostly hydrophobic) come together and give
rise to another weaker, but widespread, kind of interaction described as the van der Waals
force, and thus the core packs tightly. This weak non-bonding interaction is actually
present all the time between close atoms, including those from water, but in many
situations it is swamped by other, stronger interactions. As a final point on protein folding,
it should be noted that some large sections of amino acid sequences do not have a
significant hydrophobic component. These regions will typically not form a single stable,
folded structure because they don’t have the ability to form a hydrophobic core. Usually
this results in the region being highly dynamic or unstructured and is commonly seen at
the ends of protein chains and as flexible linkers between folded domains, which are
compact and globular.
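The idea that hydrophobicity is a matter of degree can be made concrete with a numerical hydropathy scale. The sketch below uses the Kyte-Doolittle scale, which is a swapped-in choice not mentioned in the text, and averages it over a sliding window along the sequence; strongly negative stretches hint at regions that are less able to contribute to a hydrophobic core. The function name, window size and example sequence are assumptions for illustration.

```python
# Kyte-Doolittle hydropathy values: positive means more hydrophobic
HYDROPATHY = {
    'I': 4.5, 'V': 4.2, 'L': 3.8, 'F': 2.8, 'C': 2.5, 'M': 1.9,
    'A': 1.8, 'G': -0.4, 'T': -0.7, 'S': -0.8, 'W': -0.9, 'Y': -1.3,
    'P': -1.6, 'H': -3.2, 'E': -3.5, 'Q': -3.5, 'D': -3.5, 'N': -3.5,
    'K': -3.9, 'R': -4.5,
}


def hydropathy_profile(sequence, window=5):
    """Mean hydropathy for each full window along the sequence.
    Returns an empty list if the sequence is shorter than the window."""
    values = [HYDROPATHY[residue] for residue in sequence.upper()]
    profile = []
    for start in range(len(values) - window + 1):
        segment = values[start:start + window]
        profile.append(sum(segment) / window)
    return profile


if __name__ == '__main__':
    seq = 'MKDEEKSGSLIVLAVFIAGSDKKE'  # Made-up example sequence
    window = 5
    for index, score in enumerate(hydropathy_profile(seq, window)):
        segment = seq[index:index + window]
        label = 'hydrophobic' if score > 0.0 else 'hydrophilic'
        print('%2d %s %6.2f %s' % (index + 1, segment, score, label))
```

The zero threshold used for the labels is arbitrary; in keeping with the point that there is no set dividing line, the scores are best read as a continuous trend along the chain.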
Protein structure is often described in terms of a structural hierarchy, which helps us
understand the final form as a combination of smaller elements, although it says nothing
about the actual mechanism of folding. This hierarchy is roughly described as follows: