Python Programming for Biology: Bioinformatics and Beyond


Example: reverse complement a DNA sequence



[Tim J. Stevens, Wayne Boucher] Python Programming


To illustrate what we have described with a more realistic situation, next we define a
function which will take a string of letters representing a DNA or RNA sequence as an
input argument and create the reverse complement sequence, which is the sequence of an
opposing strand that will form a tight interaction through hydrogen-bonding base pairs
(see Chapter 11 for further explanation). Naturally we give the function an informative
name, and specify two input arguments: one which is mandatory and represents the
sequence, and another which is optional and indicates whether we have a DNA sequence (A,
C, G and T letters) or an RNA sequence (A, C, G and U letters). This optional argument is
named isDna and defaults to True, i.e. we assume a DNA sequence, not RNA, unless told otherwise.

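def reverseComplement(sequence, isDna=True):
    # In Python 3 maketrans is a static method of str
    # (Python 2 used: from string import maketrans)
    if isDna:
        # Guard against RNA letters in a DNA sequence
        sequence = sequence.replace('U', 'T')
        transTable = str.maketrans('ATGC', 'TACG')
    else:
        # Guard against DNA letters in an RNA sequence
        sequence = sequence.replace('T', 'U')
        transTable = str.maketrans('AUGC', 'UACG')

    # Swap each base for its pairing partner, then reverse the string
    complement = sequence.translate(transTable)
    reverseComp = complement[::-1]

    return reverseComp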

Internally this function relies on the translate() method, which is built into Python
strings (like the input sequence), and on maketrans, which makes a character substitution
table between equivalently positioned letters from two strings. In Python 3 maketrans is a
static method of the string type (str.maketrans); in Python 2 it was imported from the
string module. Also, it is notable that we use the replace() method of strings to
guard against having the wrong kind of letter (i.e. T versus U) in the input compared to
the isDna argument. The upshot of all of this is that the input sequence has letters swapped
according to the pairs G ↔ C and A ↔ T for DNA, or G ↔ C and A ↔ U for RNA, to create
complement. The reverse of this, reverseComp, is generated using the handy slice notation
with a negative step ([::-1]). This final string is what we want to pass back from the
function, and thus we use it with return at the end. The function is readily tested with some
example sequence strings:

seq1 = 'GATTACA'
seq2 = 'AUGGUG'

print(reverseComplement(seq1))               # TGTAATC
print(reverseComplement(seq1, isDna=False))  # UGUAAUC
print(reverseComplement(seq2, False))        # CACCAU
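
The individual steps inside the function can also be inspected one at a time, which may help when checking what the substitution table and the negative-step slice each contribute. The following is a minimal sketch, assuming Python 3 (where maketrans is a static method of str); the variable names table and seq are illustrative only and are not part of the function above.

```python
# Assumes Python 3, where maketrans is a static method of str.
table = str.maketrans('ATGC', 'TACG')

# The table maps character code points, e.g. ord('A') -> ord('T')
print(table[ord('A')] == ord('T'))  # True

seq = 'GATTACA'
complement = seq.translate(table)   # swap each base for its pairing partner
print(complement)                   # CTAATGT

reverseComp = complement[::-1]      # a negative step reads the string backwards
print(reverseComp)                  # TGTAATC

# Characters that are absent from the table pass through unchanged,
# so lower-case letters or ambiguity codes such as 'N' are not complemented.
print('GANTaca'.translate(table))   # CTNAaca
```

The last line suggests one limitation worth keeping in mind: the function as written only complements upper-case letters, so a caller handling mixed-case input would probably want to apply upper() to the sequence first.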

