Python Programming for Biology: Bioinformatics and Beyond

Date: 30.12.2021
[Tim J. Stevens, Wayne Boucher] Python Programming

Transcription

When the information in DNA is read, in a process called transcription, the double helix is unwound over a small region so that the bases on one strand, called the template strand, are exposed. These exposed nucleotides specify how an RNA molecule is made. Like DNA, RNA is a polymer composed of four different types of nucleotide residue. RNA chains do not form stable double helices as DNA does, but they can associate with a length of DNA following the same base-pairing rules (the exception being that the T base of DNA is replaced in RNA by the similar U, which also pairs with A). Each exposed nucleotide in the DNA chain will bind only one complementary kind of RNA nucleotide, which is added to the end of the growing RNA chain; thus the DNA sequence dictates the RNA sequence in a predictable way.
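The pairing rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the book: the names `RNA_PAIR` and `transcribe_template` are hypothetical, the template is assumed to be an uppercase string written in the usual 5'-to-3' direction, and the string is reversed because the RNA chain runs antiparallel to the strand it pairs with.

```python
# Which RNA nucleotide binds to each exposed base of the DNA template.
# Note that A in the template pairs with U (not T) in the RNA.
RNA_PAIR = {'A': 'U', 'T': 'A', 'G': 'C', 'C': 'G'}

def transcribe_template(template):
    """Return the RNA (5'->3') specified by a DNA template strand (5'->3').

    Hypothetical helper: assumes an uppercase string of A, C, G and T only.
    """
    # The template is read backwards so that the RNA comes out
    # written in its own 5'->3' direction.
    return ''.join(RNA_PAIR[base] for base in reversed(template))

rna = transcribe_template('TTACCGCAT')
print(rna)  # AUGCGGUAA
```

Because each template base maps to exactly one RNA base, the dictionary lookup captures the "predictable" relationship between the two sequences.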

Which of the two physical DNA strands acts as the template for making RNA varies; it could be either. In other words, regions of both DNA strands are used as RNA templates, but a specific gene will use only one strand. When an RNA molecule is made, its nucleotides are joined into a chain while physically bound to the template DNA strand, so its sequence mirrors that of the DNA. Given that RNA uses the same base-pairing rules as DNA, it will be complementary to the template. The other DNA strand is also complementary to the template (the two usually pair up to form a helix), so the RNA and this other DNA strand have the same sequence, apart from U replacing T. The DNA strand that has the same sequence as the RNA is called the coding strand. When dealing with gene sequences in computing, for example when looking in a bioinformatics database, you will usually be working with the sequence of the coding DNA strand, which is the same as the RNA sequence. Also, even though RNAs really have U bases instead of T bases, in bioinformatics an RNA sequence will often be presented with Ts in place of the Us. This is certainly programming laziness, but it does mean that most programs don’t care whether a sequence came from RNA or DNA; after all, the two are often representations of the same information.
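The relationship between the coding strand, the template strand and the RNA can be checked with a short sketch. The function names here (`reverse_complement`, `coding_to_rna`, `rna_as_dna`) are hypothetical and chosen for illustration, and sequences are assumed to be uppercase strings written 5' to 3'.

```python
def reverse_complement(dna):
    """Return the partner strand of a DNA sequence, written 5'->3'."""
    pairs = {'A': 'T', 'T': 'A', 'G': 'C', 'C': 'G'}
    return ''.join(pairs[base] for base in reversed(dna))

def coding_to_rna(coding):
    """The RNA has the coding-strand sequence, with U in place of T."""
    return coding.replace('T', 'U')

def rna_as_dna(rna):
    """Present an RNA sequence with Ts, as many programs and databases do."""
    return rna.replace('U', 'T')

coding = 'ATGCGGTAA'                    # what a database would typically hold
template = reverse_complement(coding)   # the strand the RNA is built against
rna = coding_to_rna(coding)

print(template)         # TTACCGCAT
print(rna)              # AUGCGGUAA
print(rna_as_dna(rna))  # ATGCGGTAA, identical to the coding strand
```

Working from the coding strand means that a single string replacement is enough to move between the DNA and RNA forms; the template strand is only needed if the physical pairing itself is of interest.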

Translation

Most RNA molecules go on to specify the amino acid sequences of proteins in a process called translation; these are called messenger RNAs (mRNA). Because there are 20 (common) types of protein amino acid and only four RNA nucleotides, a combination of RNA nucleotides is required to specify each amino acid. By a mechanism which we will not go into, from a point within an mRNA (starting with the sequence ‘AUG’) each subsequent group of three bases, called a codon, directs one of the 20 common protein amino acids to be joined onto a growing protein chain. Because DNA can be copied into RNA from either of its two strands, and because on each strand there are three possible ways to group the nucleotides into codons, DNA has six reading frames, i.e. six possible ways for the same region to be used to make a protein sequence. Of course, any one gene uses only one reading frame, but different genes exploit all six possibilities.
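The codon grouping and the six reading frames can be illustrated as follows. This is a minimal sketch rather than the book's own implementation: all names are hypothetical, the table is the standard genetic code written with T rather than U, and two simplifications are assumed that the text above does not cover, namely that translation begins at the first ‘ATG’ found and ends at a stop codon (marked `*`). Two bases would give only 4 × 4 = 16 combinations, too few for 20 amino acids, whereas three give 64.

```python
from itertools import product

# Standard genetic code, listed in TCAG order; '*' marks a stop codon.
BASES = 'TCAG'
AMINO_ACIDS = ('FFLLSSSSYY**CC*W' 'LLLLPPPPHHQQRRRR'
               'IIIMTTTTNNKKSSRR' 'VVVVAAAADDEEGGGG')
CODON_TABLE = {''.join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def reverse_complement(dna):
    """Return the partner strand of a DNA sequence, written 5'->3'."""
    pairs = {'A': 'T', 'T': 'A', 'G': 'C', 'C': 'G'}
    return ''.join(pairs[base] for base in reversed(dna))

def codons(seq, offset=0):
    """Split seq into complete three-base groups, starting at offset."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]

def six_frames(dna):
    """Return the codons for all six reading frames of a DNA region."""
    frames = []
    for strand in (dna, reverse_complement(dna)):  # two strands
        for offset in range(3):                    # three groupings each
            frames.append(codons(strand, offset))
    return frames

def translate(seq):
    """Translate from the first ATG (AUG) up to a stop codon, if any."""
    seq = seq.replace('U', 'T')  # accept RNA or DNA lettering
    start = seq.find('ATG')
    if start < 0:
        return ''
    protein = []
    for codon in codons(seq, start):
        amino_acid = CODON_TABLE[codon]
        if amino_acid == '*':
            break
        protein.append(amino_acid)
    return ''.join(protein)

for frame in six_frames('ATGCGGTAA'):
    print(frame)
print(translate('AUGCGGUAA'))  # MR
```

For the example region, only the first frame begins with a start codon, which mirrors the point that a given gene uses just one of the six possibilities.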
