Python Programming for Biology: Bioinformatics and Beyond


Reading sequences from FASTQ files



[Tim J. Stevens, Wayne Boucher] Python Programming


A FASTQ-format sequence file is read by using the imported reader class FastqReader to
make an object that represents the open file. Each sequence record in the file is then
extracted simply by looping through this object. Each seqRead object yielded by the loop
is an instance of the SequenceWithQualities class from the HTSeq library, and so comes
with a good deal of inbuilt functionality. Some of this is demonstrated below: printing
the name, the sequence itself and the reverse complement of the read. Here the slice
notation [::-1] reverses the reverse complement again, so that the printed bases line up
position by position with the main sequence.

from HTSeq import FastqReader

fileObj = FastqReader(fastqFile)

for seqRead in fileObj:
    print(seqRead.name)
    print(seqRead.seq)
    # Reverse complement, reversed again to align with the read
    print(seqRead.get_reverse_complement()[::-1])

This gives a result like:

r999
AGGATAATGAGGCGAGCCGGGGGAACTGAAANTGG
TCCTATTACTCCGCTCGGCCCCCTTGACTTTNACC
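The relationship between the two printed sequences can be checked without HTSeq. The following is a minimal sketch in plain Python: the helper name reverse_complement and the translation table are illustrative assumptions, not part of the HTSeq API. It reproduces the third output line above from the second.

```python
# Plain-Python sketch; HTSeq is not used here.
# Translation table mapping each base to its complement (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string (hypothetical helper)."""
    return seq.translate(COMPLEMENT)[::-1]

read = "AGGATAATGAGGCGAGCCGGGGGAACTGAAANTGG"
rev_comp = reverse_complement(read)

print(read)
# Reversing the reverse complement leaves the plain complement,
# aligned base for base with the original read.
print(rev_comp[::-1])
```

Printing rev_comp directly, without the extra [::-1], would instead show the complementary strand read in the opposite direction.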

Because these sequence records come from a file format that incorporates quality scores,
we can naturally interrogate those scores. In this example we generate a graph of the mean
score at each position along the reads. The variable meanQual starts out as None and, for
the first sequence, is set to a NumPy array of the scores, seqRead.qual. For subsequent
records the scores are added to this array (element by element, as is the standard NumPy
way), which assumes that all the reads have the same length. At the end the whole array is
divided by numReads to give the average value at each position along the sequence, and
this is plotted using pyplot from matplotlib.

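from numpy import array
from matplotlib import pyplot

numReads = 0.0
meanQual = None

for seqRead in fileObj:
    print(seqRead.qual)
    if meanQual is None:
        # Copy the first read's scores as floats to start the running total
        meanQual = array(seqRead.qual, float)
    else:
        # Element-by-element addition of this read's scores
        meanQual += seqRead.qual
    numReads += 1.0

if numReads:
    # Mean score at each position along the reads
    pyplot.plot(meanQual/numReads)
    pyplot.show()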
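To make the averaging step concrete, here is a minimal sketch that computes the same per-position mean in plain Python, without HTSeq or NumPy. It rests on several assumptions that are not taken from the text above: the function names read_fastq and mean_quality are hypothetical, each record is assumed to occupy exactly four lines, the quality characters are assumed to use the common Phred+33 encoding, and the two records are made-up illustrative data. Unlike the array-based version, it keeps a separate count for each position, so reads of different lengths are tolerated.

```python
import io

def read_fastq(handle):
    """Yield (name, sequence, quality string) from four-line FASTQ records."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            break
        seq = handle.readline().rstrip()
        handle.readline()  # the '+' separator line is ignored
        qual = handle.readline().rstrip()
        yield header[1:], seq, qual

def mean_quality(handle, offset=33):
    """Return the mean quality score at each read position (Phred+33 assumed)."""
    totals = []  # running sum of scores per position
    counts = []  # number of reads covering each position
    for name, seq, qual in read_fastq(handle):
        for i, char in enumerate(qual):
            if i == len(totals):
                totals.append(0)
                counts.append(0)
            totals[i] += ord(char) - offset
            counts[i] += 1
    return [total / count for total, count in zip(totals, counts)]

# Two made-up records: 'I' decodes to 40 and '5' decodes to 20
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACG\n+\n555\n")
means = mean_quality(example)
print(means)  # [30.0, 30.0, 30.0, 40.0]
```

The resulting list could be passed to pyplot.plot() in the same way as meanQual/numReads.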

