Python Programming for Biology: Bioinformatics and Beyond



[Tim J. Stevens, Wayne Boucher] Python Programming

Structural subsets

Next we will consider the dissection of molecular structures into smaller parts. This is useful in many situations. You may, for example, want to remove a flexible region from the analysis of your molecule. Alternatively, you might want to select only a certain kind of residue or certain kinds of atoms. The latter may be done to define the backbone path of the molecular chain, which is useful when comparing structures with dissimilar sequences.

The example Python function we describe makes a subset of a structure by making a restricted copy of another structure, including only the atoms which are required. Alternative approaches would be to remove atoms from an existing structure, or to load only certain atoms in the first place; these may save a little computer memory. First we import the definitions of the classes of structural objects we wish to make:

from numpy import array  # Used later to copy atom coordinates

from Modelling import Structure, Chain, Residue, Atom



A function is then defined which takes an input structure and three other, optional, arguments that specify which chains, residues and atoms to consider. If any of these arguments is not specified (so defaults to None), it is taken to mean that no filtering is done for that kind of component and all are included. The chainCodes argument is assumed to be a collection of letter codes, e.g. ['A', 'B'], residueIds is assumed to be a collection of residue numbers, and atomNames, as you might expect, a collection of atom names. You can use any of the common Python collection types here (list, tuple or set), although these will be converted to sets using set() to remove repeats and to make membership tests fast.
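As a small illustration of the conversion to sets, the following sketch uses made-up residue numbers to show that set() accepts a list, tuple or set alike and discards repeats. The variable names are illustrative only and are not part of the function described here.

```python
# Hypothetical selections supplied as a list, a tuple and a set
asList = [10, 11, 11, 12]
asTuple = (10, 11, 11, 12)
asSet = {10, 11, 12}

# set() accepts any of them and removes the repeated 11
selections = [set(asList), set(asTuple), set(asSet)]

# All three conversions give the same set of residue numbers
allSame = selections[0] == selections[1] == selections[2]

# Membership tests work the same way whatever the input type was
found = 11 in selections[0]
missing = 99 in selections[0]
```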

def filterSubStructure(structure, chainCodes=None,
                       residueIds=None, atomNames=None):

Within the function we determine a name for the new Structure object we are going to make by using the template Structure object's name, and then adding '_filter' plus other strings that list which chain codes, residue numbers (converted to strings) and atom names we have selected. Note how we first check to see if a chain, residue or atom specification was supplied (not None and not empty, and hence true) before the name is extended.

    name = structure.name + '_filter'

    if chainCodes:
        name += ' ' + ','.join(chainCodes)
        chainCodes = set(chainCodes)

    if residueIds:
        name += ' ' + ','.join([str(x) for x in residueIds])
        residueIds = set(residueIds)

    if atomNames:
        name += ' ' + ','.join(atomNames)
        atomNames = set(atomNames)
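The naming step can be tried on its own. The sketch below wraps the same logic in a helper called makeFilterName, which is a hypothetical name introduced only for this illustration, as is the base name '1ABC'. It also shows that an empty collection is treated like None, because both are false in an if test.

```python
def makeFilterName(baseName, chainCodes=None, residueIds=None,
                   atomNames=None):
    """Build a descriptive name from whichever selections were given."""
    name = baseName + '_filter'

    if chainCodes:
        name += ' ' + ','.join(chainCodes)

    if residueIds:
        # join() only accepts strings, so the numbers are converted first
        name += ' ' + ','.join([str(x) for x in residueIds])

    if atomNames:
        name += ' ' + ','.join(atomNames)

    return name


# Lists are used here so that the order of items in the name is fixed
fullName = makeFilterName('1ABC', ['A'], [5, 6], ['N', 'CA', 'C'])
plainName = makeFilterName('1ABC')
emptyName = makeFilterName('1ABC', chainCodes=[])

print(fullName)
print(plainName)
```

If sets are passed instead of lists, the selections are the same but the order in which the items appear in the name is not guaranteed.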

Next the class definition for Structure is used to make a new instance of that kind of object, which we refer to as filterStruc. Although we defined a new name for this new object, we keep the conformation number and PDB identifier from the original; these indicate the origin of the data, and have not changed.

    conf = structure.conformation
    pdbId = structure.pdbId
    filterStruc = Structure(name=name, conformation=conf, pdbId=pdbId)

The main body of the function loops through all of the chains, residues and atoms of the input, selecting only those we wish to duplicate. First we go through each Chain object and, if we have specified a filtering collection for its code (chainCodes), we exclude any that are not mentioned; the rest of the loop body, and hence the chain, is skipped by using the continue statement. If a chain is not excluded then we initialise a list that will contain the residues to copy:

    for chain in structure.chains:

        if chainCodes and (chain.code not in chainCodes):
            continue

        includeResidues = []

For each included chain we loop through its Residue objects and perform a similar check to see if the residue should be included. If the residueIds argument was filled but the residue number is not present then that residue is skipped. Otherwise, we go on to collect a list of atoms.

        for residue in chain.residues:
            if residueIds and (residue.seqId not in residueIds):
                continue

            includeAtoms = []

Again, in the same way, we check to see if each atom's name is in our collection of things to include, and if so the template Atom object is added to the list.

            for atom in residue.atoms:
                if atomNames and (atom.name not in atomNames):
                    continue

                includeAtoms.append(atom)

If we have notionally decided to include a particular residue but that residue does not contain any of the required atom types, then there is no need to copy this residue at all. When there are some atoms to copy for this residue, i.e. includeAtoms is not empty, both the list of atoms and the Residue object are placed in the includeResidues list. We could have placed the atoms in a big list on their own, but it is convenient to keep them with the corresponding residue, given that we need to specify the Residue (parent object) when making an Atom (child object).

            if includeAtoms:
                includeResidues.append((residue, includeAtoms))
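The residue and atom checks can be tried on plain Python data without any structure classes. The following is a minimal sketch in which each residue is reduced to a (number, atom names) pair; the function name collectResidues and the toy data are assumptions made for this illustration only.

```python
def collectResidues(residues, residueIds=None, atomNames=None):
    """Return (residue number, atom names) pairs that pass both filters."""
    includeResidues = []

    for seqId, names in residues:
        if residueIds and (seqId not in residueIds):
            continue

        includeAtoms = []
        for name in names:
            if atomNames and (name not in atomNames):
                continue

            includeAtoms.append(name)

        # A residue with no matching atoms is dropped entirely
        if includeAtoms:
            includeResidues.append((seqId, includeAtoms))

    return includeResidues


# Hypothetical residues; number 101 has no backbone atom names
toyResidues = [
    (1, ['N', 'CA', 'C', 'O']),
    (2, ['N', 'CA', 'C', 'O', 'CB']),
    (101, ['O']),
]

backbone = collectResidues(toyResidues, atomNames={'N', 'CA', 'C'})
onlyFirst = collectResidues(toyResidues, residueIds={1})
```

Here residue 101 passes the residue check, because no residue filter was given, but is left out of backbone because none of its atoms match.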

If the residue list is not empty, we can make a new chain in the new Structure object, which is passed in at Chain creation to specify the parent link. With the chain now made we loop through the list of residues and corresponding atoms to make new Residue and Atom objects in the new structure. Notice that we use the attributes of the original objects when making the new ones. Thus, the residue copies will have the same number and code, and the new atoms will have the same names and coordinates (albeit in a new array). Also, remember when making these objects within our structure we always have to specify the parent object, going up the data model hierarchy.

        if includeResidues:
            filterChain = Chain(filterStruc, chain.code, chain.molType)

            for residue, atoms in includeResidues:
                filterResidue = Residue(filterChain, residue.seqId,
                                        residue.code)

                for atom in atoms:
                    coords = array(atom.coords)
                    Atom(filterResidue, name=atom.name, coords=coords)

Finally, the new Structure object, with its selectively copied components, is passed back:

    return filterStruc
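To see the fragments working together without the Modelling module, the sketch below assembles them into one function and runs it on a tiny hand-made structure. The four classes are simplified stand-ins written for this illustration: their constructor arguments are inferred from the calls shown above, and the assumption that each child registers itself with its parent is ours. Coordinates are copied with list() rather than a NumPy array so that the example needs no extra packages.

```python
class Structure:
    def __init__(self, name, conformation=0, pdbId=None):
        self.name = name
        self.conformation = conformation
        self.pdbId = pdbId
        self.chains = []

class Chain:
    def __init__(self, structure, code, molType='protein'):
        self.structure = structure
        self.code = code
        self.molType = molType
        self.residues = []
        structure.chains.append(self)  # Assumed parent registration

class Residue:
    def __init__(self, chain, seqId, code):
        self.chain = chain
        self.seqId = seqId
        self.code = code
        self.atoms = []
        chain.residues.append(self)

class Atom:
    def __init__(self, residue, name, coords):
        self.residue = residue
        self.name = name
        self.coords = coords
        residue.atoms.append(self)

def filterSubStructure(structure, chainCodes=None,
                       residueIds=None, atomNames=None):
    name = structure.name + '_filter'
    if chainCodes:
        name += ' ' + ','.join(chainCodes)
        chainCodes = set(chainCodes)
    if residueIds:
        name += ' ' + ','.join([str(x) for x in residueIds])
        residueIds = set(residueIds)
    if atomNames:
        name += ' ' + ','.join(atomNames)
        atomNames = set(atomNames)

    filterStruc = Structure(name=name, conformation=structure.conformation,
                            pdbId=structure.pdbId)

    for chain in structure.chains:
        if chainCodes and (chain.code not in chainCodes):
            continue
        includeResidues = []
        for residue in chain.residues:
            if residueIds and (residue.seqId not in residueIds):
                continue
            includeAtoms = [atom for atom in residue.atoms
                            if not atomNames or atom.name in atomNames]
            if includeAtoms:
                includeResidues.append((residue, includeAtoms))

        if includeResidues:
            filterChain = Chain(filterStruc, chain.code, chain.molType)
            for residue, atoms in includeResidues:
                filterResidue = Residue(filterChain, residue.seqId,
                                        residue.code)
                for atom in atoms:
                    # list() stands in for array(): a new, independent copy
                    Atom(filterResidue, name=atom.name,
                         coords=list(atom.coords))
    return filterStruc

# Hand-made test data (coordinates are arbitrary)
demo = Structure('demo', conformation=0, pdbId='TEST')
chainA = Chain(demo, 'A', 'protein')
ala = Residue(chainA, 1, 'ALA')
for atomName, xyz in [('N', [0.0, 0.0, 0.0]), ('CA', [1.5, 0.0, 0.0]),
                      ('C', [2.0, 1.4, 0.0]), ('O', [1.2, 2.3, 0.0])]:
    Atom(ala, atomName, xyz)
chainB = Chain(demo, 'B', 'protein')
gly = Residue(chainB, 1, 'GLY')
Atom(gly, 'CA', [9.0, 9.0, 9.0])

backbone = filterSubStructure(demo, ['A'], None, ['N', 'CA', 'C'])
print(backbone.name, [a.name for a in backbone.chains[0].residues[0].atoms])
```

The inner atom loop is written here as a list comprehension purely to keep the sketch short; it selects the same atoms as the explicit loop with continue.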

The function can be tested by specifying the chain, residue and atom selection, assuming that struc refers to a Structure object that has already been loaded. Here we select chain 'A', all residues (so the filter is None) and the backbone heavy atoms ['N', 'CA', 'C'].

chainCodes = set(['A'])
residueIds = None  # No residue filter: all of them
atomNames = set(['N','CA','C'])  # Heavy backbone atoms (not H)

chain_A_backbone = filterSubStructure(struc, chainCodes,
                                      residueIds, atomNames)

We could make a dedicated, streamlined function to make a complete copy of a structure. However, using the above filterSubStructure() function without passing any chain, residue or atom selection results in a full copy of the input structure. Thus we could be cheeky and do the following to pretend we had a dedicated copy function:

def copyStructure(structure):

    return filterSubStructure(structure, None, None, None)



