Python Programming for Biology: Bioinformatics and Beyond


Accessing public databases



[Tim J. Stevens, Wayne Boucher] Python Programming

If, rather than reading a sequence record from a file, we wish to get data directly from a database, Biopython provides a few helper functions that give easy access to some large sequence databases via Internet-based services, so that we do not have to talk to the database directly. For example, we can fetch a FASTA-format record from the NCBI databases (such as GenBank) as follows, although we naturally have to know the identifier of the sequence we want.

We import the Entrez module, set the email address attribute (to identify ourselves, as encouraged by the database) and then call a function to fetch a given entry, specifying the database type "protein", the return format "fasta" and the sequence identifier number.

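from Bio import Entrez

# NCBI asks callers to identify themselves; use your own address here
Entrez.email = 'mickey@disney.com'

# Request protein entry 71066805 in FASTA format
socketObj = Entrez.efetch(db="protein", rettype="fasta", id="71066805")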
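To make the idea of an "Internet-based service" more concrete, the sketch below builds the kind of web address that such a request corresponds to. It assumes the publicly documented NCBI E-utilities efetch endpoint; the helper name build_efetch_url is hypothetical and is not part of Biopython, and the exact parameters that Biopython itself sends may differ. Nothing is sent over the network here; the code only constructs and prints the URL.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities efetch service (assumed endpoint)
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(db, identifier, rettype, email):
    """Hypothetical helper: return the URL for a plain-text efetch request."""
    params = {
        "db": db,            # database type, e.g. "protein"
        "id": identifier,    # sequence identifier
        "rettype": rettype,  # return format, e.g. "fasta"
        "retmode": "text",   # ask for plain text rather than XML
        "email": email,      # identifies the caller to the service
    }
    # urlencode escapes special characters such as '@'
    return EFETCH_URL + "?" + urlencode(params)


url = build_efetch_url("protein", "71066805", "fasta", "mickey@disney.com")
print(url)
```

The same three pieces of information passed to Entrez.efetch (database, return type and identifier) appear as query parameters, which is why the helper function needs them as arguments.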




The above function call returns an open socket object (assuming the connection works) that can be used in the same way as a normal open file object, although it actually represents a connection via the Internet. Hence, the sequence is read in the same way as from a FASTA file:

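from Bio import SeqIO  # needed here if not already imported

# Parse the single FASTA record from the open connection, then close it
proteinObj = SeqIO.read(socketObj, "fasta")
socketObj.close()

print(proteinObj.description)
print(proteinObj.seq)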
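The reason a network connection can stand in for a file is that a reader only needs an object it can iterate over line by line. The following is a minimal sketch of that idea using only the standard library: read_single_fasta is a hypothetical, much simplified stand-in for a FASTA reader (it is not how SeqIO is implemented), and io.StringIO plays the role of the open connection so that the example runs without Internet access. The record contents are made up for illustration.

```python
import io


def read_single_fasta(handle):
    """Read exactly one FASTA record from any file-like object.

    Returns a (description, sequence) tuple. Simplified illustration only.
    """
    description = None
    seq_parts = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if description is not None:
                raise ValueError("More than one record found")
            description = line[1:]  # drop the leading '>'
        elif description is None:
            raise ValueError("Sequence data found before a '>' header line")
        else:
            seq_parts.append(line)
    if description is None:
        raise ValueError("No FASTA record found")
    return description, "".join(seq_parts)


# An in-memory text stream stands in for the open connection (made-up data)
fake_handle = io.StringIO(">example_id hypothetical protein\nMKT\nAYI\n")
description, sequence = read_single_fasta(fake_handle)
fake_handle.close()

print(description)
print(sequence)
```

The function never checks where the lines come from, so the same call would work on an object returned by open() or on any other handle that yields text lines.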

In a similar way we can read a SWISSPROT record using the ExPASy module. Note, however, that the function that finds the sequence and returns an open socket (get_sprot_raw) is different from before and takes different arguments, because such specifics depend on the exact details of the Internet service that the database provides.

from Bio import ExPASy, SeqIO

# Fetch the raw Swiss-Prot entry for human beta-globin and parse it
socketObj = ExPASy.get_sprot_raw('HBB_HUMAN')
proteinObj = SeqIO.read(socketObj, "swiss")
socketObj.close()

print(proteinObj.description)
print(proteinObj.seq)
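Because each service has its own function and arguments, it can be convenient to hide the differences behind one small wrapper. The sketch below is one possible arrangement, not an established Biopython feature: the names open_entrez, open_swissprot, SOURCES and fetch_text are all hypothetical. The two opener functions import Biopython lazily and would need a network connection (and Entrez.email to be set) if actually called; the usage example at the end therefore substitutes a fake in-memory source so that it runs offline.

```python
import io


def open_entrez(identifier):
    """Open an NCBI protein entry in FASTA format (needs Biopython and network)."""
    from Bio import Entrez
    return Entrez.efetch(db="protein", rettype="fasta", id=identifier)


def open_swissprot(identifier):
    """Open a raw Swiss-Prot entry (needs Biopython and network)."""
    from Bio import ExPASy
    return ExPASy.get_sprot_raw(identifier)


# Each source name maps to (function that opens a handle, SeqIO format name)
SOURCES = {
    "ncbi_protein": (open_entrez, "fasta"),
    "swissprot": (open_swissprot, "swiss"),
}


def fetch_text(source, identifier, sources=SOURCES):
    """Return (format name, raw record text) for the given source and identifier."""
    if source not in sources:
        raise ValueError("Unknown source: %s" % source)
    opener, file_format = sources[source]
    handle = opener(identifier)
    try:
        return file_format, handle.read()
    finally:
        handle.close()  # always release the connection, even on error


# Offline demonstration with a fake source that returns made-up data
fake_sources = {
    "demo": (lambda ident: io.StringIO(">" + ident + "\nMKT\n"), "fasta"),
}
file_format, text = fetch_text("demo", "example_id", sources=fake_sources)
print(file_format)
print(text)
```

The returned format name can then be passed to SeqIO.read together with the text wrapped in io.StringIO, so that the calling code no longer needs to know which service supplied the record.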

