Python Programming for Biology: Bioinformatics and Beyond

[Tim J. Stevens, Wayne Boucher] Python Programming

Protein structure

When we determine the amino acid sequence of a protein we gain the knowledge of which

types of residue have been linked into a polypeptide chain. Because we know the chemical

structure of the individual amino acid components, and because protein chains are formed

in a regular and predictable way, we therefore know virtually all of the atoms and covalent

bond connections that are present in the entire protein molecule. In general the only

deviations from this overall chemical structure will occur where small parts of specific

chemical groups are not static, as when hydrogen ions hop on and off acidic residues, or if

the protein is subsequently modified by enzymes. Such post-translational modifications

include the formation of cross-links (between cysteine residues), cutting of the peptide

backbone and the addition of other moieties like sugars, fats and phosphate groups. While

modifications complicate matters, even if we are unable to do specific experiments to

determine what has happened (e.g. mass spectrometry) we can often discover what

has occurred once we determine the overall three-dimensional structure.
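Because the sequence fixes which atoms are present, the expected mass of an unmodified chain can be estimated directly from it. The following is a minimal sketch of that idea, not a definitive implementation: the function name is hypothetical, the residue masses are rounded average values in daltons, and modifications and charge states are ignored. A measured mass that differs from the estimate (for example by roughly 80 daltons for each added phosphate group) hints at a post-translational modification.

```python
# Approximate average masses (daltons) of amino acid residues, i.e. each
# amino acid minus the water lost on forming a peptide link.
# These rounded values are an assumption of this sketch.
RESIDUE_MASS = {
    'G': 57.05, 'A': 71.08, 'S': 87.08, 'P': 97.12, 'V': 99.13,
    'T': 101.11, 'C': 103.14, 'L': 113.16, 'I': 113.16, 'N': 114.10,
    'D': 115.09, 'Q': 128.13, 'K': 128.17, 'E': 129.12, 'M': 131.19,
    'H': 137.14, 'F': 147.18, 'R': 156.19, 'Y': 163.18, 'W': 186.21,
}

WATER_MASS = 18.02  # One water accounts for the two free chain ends


def estimate_protein_mass(sequence):
    """Estimate the mass of an unmodified polypeptide from one-letter codes."""
    residue_total = sum(RESIDUE_MASS[letter] for letter in sequence.upper())
    return residue_total + WATER_MASS


if __name__ == '__main__':
    print(estimate_protein_mass('GA'))
```

An unknown letter raises a KeyError here, which is a deliberate choice for a sketch: it is better to fail than to silently report a mass for a sequence we cannot interpret.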

We will now consider why proteins fold into their respective shapes. Protein folding is a

deep topic because the number of potential conformations for a typical polypeptide chain

is vast and the relationship between protein sequence and structure is generally not

predictable. Even where it is possible to investigate a sufficient number of hypothetical

three-dimensional arrangements, identifying the correct one (the native conformation

observed in nature) from purely theoretical considerations requires

exceedingly long calculations. Fortunately, in molecular biology we

generally don’t have to make such tricky predictions because we can determine protein

structure by performing experiments and making observations. Because protein folding is

an exceedingly complex topic, most discussions about its mechanisms are well beyond the

remit of this programming book. Nevertheless we will describe some of the basic

principles, specifically what kind of forces are involved in holding a protein structure

together, because this helps us understand the features we observe in structure data.
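To get a feel for why the number of potential conformations is described as vast, a rough count can be sketched as follows. The numbers are illustrative assumptions rather than measured quantities: we suppose each residue can adopt only three distinct states, and that a hypothetical computer could examine a million million conformations per second. The function names are likewise made up for this sketch.

```python
def count_conformations(num_residues, states_per_residue=3):
    """Count chain conformations if every residue independently
    adopts one of a small number of states (an assumed model)."""
    return states_per_residue ** num_residues


def years_to_enumerate(num_conformations, rate_per_second=1e12):
    """Time in years to visit every conformation at an assumed rate."""
    seconds_per_year = 60 * 60 * 24 * 365
    return num_conformations / (rate_per_second * seconds_per_year)


if __name__ == '__main__':
    total = count_conformations(100)
    print('Conformations for 100 residues: %.2e' % total)
    print('Years to enumerate them all: %.2e' % years_to_enumerate(total))
```

Even with these generous simplifications an exhaustive search of a modest 100-residue chain is hopeless, which is why experimental structure determination is so valuable.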

Overall the folding of molecules can be thought of in terms of energy. The atoms of a

molecule, because they are in constant thermal motion, are able to change relative

position so that the overall conformation moves towards the lowest, most stable energy.

Generally you can think of this as the three-dimensional arrangement that forms the most

stabilising interactions between atoms. Strictly speaking a molecule will not be static at

its energy minimum, because it will move about due to temperature (it has kinetic energy).

Accordingly, we often think of a molecule’s native state as being a set of similar

conformations that are close to the energy minimum, albeit bumbling about. It should

always be remembered that the higher the temperature, the wider a molecule’s motions

and the further it can stray from the minimum.
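The effect of temperature on how far a molecule strays from its energy minimum can be illustrated with the Boltzmann distribution, which gives the relative population of states at thermal equilibrium. This is a minimal sketch under stated assumptions: the four energies are invented values in kJ/mol, not data for any real protein, and the function name is hypothetical.

```python
import math

GAS_CONSTANT = 8.314e-3  # kJ per mole per kelvin


def boltzmann_populations(energies, temperature):
    """Return the fractional population of each state, given state
    energies in kJ/mol and the temperature in kelvin."""
    lowest = min(energies)  # Shift energies so the exponents stay small
    weights = [math.exp(-(energy - lowest) / (GAS_CONSTANT * temperature))
               for energy in energies]
    total = sum(weights)
    return [weight / total for weight in weights]


if __name__ == '__main__':
    # Invented example: a minimum and three progressively less stable states
    energies = [0.0, 2.0, 5.0, 20.0]
    for temperature in (280.0, 310.0, 370.0):
        populations = boltzmann_populations(energies, temperature)
        print(temperature, ['%.3f' % p for p in populations])
```

As the temperature rises the lowest-energy state loses population to the higher-energy states, which is the numerical counterpart of a molecule making wider excursions.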

Proteins fold into compact, globular structures because of the way amino acids interact

with one another and whether they interact (or do not interact) with water molecules, the

primary biological solvent that surrounds them. Sometimes a protein will have cysteine

residues that form covalent disulphide links (under oxidising conditions) that tie different

parts of the protein together, but most of the compactness and precision of folding is due

to weaker, non-covalent interactions, including those with water molecules. In simple

terms the residues that can form stabilising interactions with water lie on the outside and

those that cannot lie on the inside (in the core). Admittedly there are some kinds of

proteins that aren’t really dissolved in water directly, including those that are embedded in

lipid bilayers (the fatty membranes that surround cells and their internal compartments).

However, even here it is the ability of particular amino acids to interact with or avoid

water that is behind the formation of a compact structure.

The atoms around the peptide links, which form the backbone of a protein’s amino acid

chain, are capable of interacting in a stabilising way with water and amongst themselves;

the amide (N-H) and carbonyl (C=O) groups form polar hydrogen bonds. All things being

equal the interaction with water is stronger, but the other parts of the amino acids that stick

out from their backbone, the side chains, tip the balance so that the protein backbone is

mostly stabilised by the backbone atoms hydrogen bonding with each other, and not water.

The different amino acids have chemical structures that govern whether their side chain

can make a significant interaction with water. Side chains containing atomic groups that

can form relatively strong hydrogen bonds (O-H, N-H, C=O) and those that carry an

electric charge are said to be hydrophilic (water-loving), because they can make stabilising

interactions with water. Those that do not are described as hydrophobic (water-hating).

Strictly speaking there is not a set dividing line between hydrophobic and hydrophilic; it is

more a matter of degree. In an aqueous (water) environment, the hydrophobic and

hydrophilic residues segregate when a protein folds, to form a hydrophobic core and

hydrophilic exterior, i.e. a globule. This is just a general trend though; the protein globule

is stabilised further by the hydrogen bonds along the backbone, which tend to form regular

patterns of hydrogen-bonding networks, called secondary structure. Also, the electric

charges and polarities will push and pull the structure into the final shape. This final

conformation is one where the core residues (mostly hydrophobic) come together and give

rise to another weaker, but widespread, kind of interaction described as the van der Waals

force, and thus the core packs tightly. This weak non-bonding interaction is actually

present all the time between close atoms, including those from water, but in many

situations it is swamped by other, stronger interactions. As a final point on protein folding,

it should be noted that some large sections of amino acid sequences do not have a

significant hydrophobic component. These regions will typically not form a single stable,

folded structure because they don’t have the ability to form a hydrophobic core. Usually

this results in the region being highly dynamic or unstructured and is commonly seen at

the ends of protein chains and as flexible linkers between folded domains, which are

compact and globular.
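The idea that hydrophobicity is a matter of degree, and that stretches of sequence lacking a hydrophobic component tend not to fold, can be explored with a sliding-window hydropathy profile. The sketch below uses the Kyte-Doolittle hydropathy scale, which is one of several published scales and is our choice here rather than something the text prescribes. The function name and the window size of nine residues are assumptions for illustration.

```python
# Kyte-Doolittle hydropathy values: positive is hydrophobic,
# negative is hydrophilic.
KYTE_DOOLITTLE = {
    'A': 1.8, 'R': -4.5, 'N': -3.5, 'D': -3.5, 'C': 2.5,
    'Q': -3.5, 'E': -3.5, 'G': -0.4, 'H': -3.2, 'I': 4.5,
    'L': 3.8, 'K': -3.9, 'M': 1.9, 'F': 2.8, 'P': -1.6,
    'S': -0.8, 'T': -0.7, 'W': -0.9, 'Y': -1.3, 'V': 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Return the mean hydropathy of each window of residues
    along a one-letter protein sequence."""
    values = [KYTE_DOOLITTLE[letter] for letter in sequence.upper()]
    profile = []
    for start in range(len(values) - window + 1):
        segment = values[start:start + window]
        profile.append(sum(segment) / window)
    return profile


if __name__ == '__main__':
    # Invented sequence: a hydrophobic stretch flanked by charged residues
    sequence = 'DEKRDEKRLIVALIVFLIVADEKRDEKR'
    for position, score in enumerate(hydropathy_profile(sequence)):
        print(position, '%.2f' % score)
```

Windows with consistently negative scores are candidates for exposed or unstructured regions, while strongly positive windows suggest core or membrane-embedded segments. This is only a crude indicator, because real folding also depends on the hydrogen bonding, charge and packing effects described above.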

Protein structure is often described in terms of a structural hierarchy, which helps us

understand the final form as a combination of smaller elements, although it says nothing

about the actual mechanism of folding. This hierarchy is roughly described as follows:
