Python Programming for Biology: Bioinformatics and Beyond



[Tim J. Stevens, Wayne Boucher] Python Programming

parent-child relationship. The Structure object is the parent and the Chain object is the child. Also, we often talk in terms of classes rather than objects and say that Structure is the parent class of Chain.

If a parent object contains children, then each child object must belong to that parent; a child object is only meaningful within the context of its parent. A consequence of this is that the parent object must be created before its children. Also, if a parent object is deleted then all of its children must also be deleted. An alternative here would be to make a Chain free-standing, meaning it could appear in more than one Structure. That would be a perfectly plausible scenario, but it is not especially helpful here and not the way we will model it.

A parent class can have many different kinds of child classes. Thus, we could have a Technique class, which represents how the structure was determined. This could also be a child of Structure. When a Structure is deleted both Technique children and Chain children would need to be deleted. To keep things simple, here we just consider Chain. Note that Structure itself has no parent. Had we decided to model things differently, we might have introduced a class Ensemble, as the parent of Structure, and then Ensemble would have no parent.

A parent-child relationship is an example of a ‘link’ between objects. Links are generally harder to manage than simple attributes (like name and pdbId), because there are two ends to consider, one for each object, and they need to be made consistent with each other. One way to manage links is to have one of the classes keep track of everything. For example, in the previous chapter we had a parent class, Protein, and a child class, AminoAcid, and the parent class managed everything. So a Protein object had a list self.aminoAcids, but an AminoAcid object had no reference to its parent Protein. It’s quite possible that this is not a problem; for example, an AminoAcid might only appear in contexts where the Protein also appears. Having all the information at one end of a link makes it easier to manage. On the other hand, it usually makes the link harder to use, because it can only be followed directly from the end that stores it.
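As a minimal sketch of a link managed from one end only, loosely following the Protein and AminoAcid description above. The constructor arguments, the addAminoAcid() method and the findProtein() helper are assumptions made for illustration, not the previous chapter's actual code:

```python
class AminoAcid:

    def __init__(self, code):
        self.code = code  # note: no reference back to a Protein


class Protein:

    def __init__(self, name):
        self.name = name
        self.aminoAcids = []  # the parent end holds the whole link

    def addAminoAcid(self, code):
        aminoAcid = AminoAcid(code)
        self.aminoAcids.append(aminoAcid)
        return aminoAcid


def findProtein(proteins, aminoAcid):
    # Hypothetical helper: with a one-ended link, going from child to
    # parent means searching every candidate parent
    for protein in proteins:
        for candidate in protein.aminoAcids:
            if candidate is aminoAcid:
                return protein
    return None


proteinA = Protein('A')
proteinB = Protein('B')
residue = proteinB.addAminoAcid('ALA')

owner = findProtein([proteinA, proteinB], residue)
print(owner.name)  # B
```

The parent-to-child direction is a simple attribute access, while the child-to-parent direction needs a search over all candidate parents.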


Here we choose to manage object-to-object links from both ends. Accordingly, a Structure object will have the attribute self.chains to access its children and a Chain object will have self.structure to access its parent. We will need to keep both of these attributes synchronised at creation and deletion of a Chain object. Each end of this link has a cardinality. A Chain object has to have a Structure parent and there can be only one of them, so that cardinality is ‘1..1’. A Structure object can have any number of Chain children (zero or more), and so that cardinality is ‘0..*’. We will assume that Chain children are ordered for a given parent, and that this order is the order in which they were created.

We will model the Chain class as having an identifying code, which is a string, and a descriptive molType, where the latter can be ‘protein’ or ‘DNA’ or ‘RNA’. Relative to the parent Structure, we will assume that the code uniquely identifies the Chain. Or to put it another way, a given Structure object has only one Chain child with a given code. Thus, the object key for a Chain, relative to its parent, is code and the full key is (structure, code). This is a typical situation for a child class; it has a key that identifies it relative to its parent and then together with the parent itself this is the full key for the child. This then leads to the following proposals for the implementation of Structure and Chain respectively:

class Structure:

    def __init__(self, name, conformation=0, pdbId=None):
        if not name:
            raise Exception('name must be set to non-empty string')
        self.name = name
        self.conformation = conformation
        self.pdbId = pdbId
        self.chains = []  # For the children

    def delete(self):
        # Loop over a copy of the list, because each chain.delete()
        # removes that chain from self.chains
        for chain in list(self.chains):
            chain.delete()

    def getChain(self, code):
        for chain in self.chains:
            if chain.code == code:
                return chain
        return None

The attribute that links the Structure parent to its children is self.chains, and this is initialised as an empty list, to be filled in as the child objects are made. The delete() function is fairly straightforward: when the structure is deleted its chains disappear. It is notable that there is no specific deletion of the structure object itself, i.e. we don’t do del self, because it is at the top of the hierarchy and will simply disappear when it is no longer associated with any Python variables (it will be garbage collected). We have also included a function to help get hold of a specific Chain object, whereby a code string is accepted as an argument. Here the function loops through the list of children (self.chains) in order to find a Chain with a matching code attribute, and, because this is the unique key to identify the Chain within its parent, there will never be more than one possible match. If no Chain matches, the function returns None.
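One point to watch when a parent's delete() loops over children that each remove themselves from the same list: removing items from a list while iterating over it makes Python skip elements. The following stand-alone illustration uses plain lists of strings in place of Chain objects (the variable names are purely illustrative):

```python
# Removing items while looping over the same list skips every other item
chains = ['A', 'B', 'C', 'D']
for chain in chains:
    chains.remove(chain)  # stands in for chain.delete()
print(chains)  # ['B', 'D']

# Looping over a copy visits every item, so the original list is emptied
chainsCopied = ['A', 'B', 'C', 'D']
for chain in list(chainsCopied):
    chainsCopied.remove(chain)
print(chainsCopied)  # []
```

Iterating over list(self.chains), or an equivalent copy such as self.chains[:], is therefore the safer way to write a delete() loop of this kind.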

The Chain class is constructed by accepting the parent structure, the code value as its key and a molecule type. The __init__() performs some checks to make sure the arguments are reasonably sensible, and this includes determining whether the parent structure already contains a Chain with the input code, which needs to be unique. If there are no errors, the arguments are stored as attributes on self. Lastly, the constructor adds the Chain (represented here by self, but filled in with an actual object at run time) onto the structure.chains list, the link from the structure parent to its children. Note that we do not modify the parent’s link to its child until after we have checked that everything is ok.

class Chain:

    allowedMolTypes = ('protein', 'DNA', 'RNA')

    def __init__(self, structure, code, molType='protein'):
        if not code:
            raise Exception('code must be set to non-empty string')
        if molType not in self.allowedMolTypes:
            raise Exception('molType="%s" must be one of %s' %
                            (molType, self.allowedMolTypes))
        # check that key code is not already used
        chain = structure.getChain(code)
        if chain:
            raise Exception('code="%s" already used' % code)
        self.structure = structure
        self.code = code
        self.molType = molType
        structure.chains.append(self)

    def delete(self):
        self.structure.chains.remove(self)



The delete() function for the Chain is a simple matter of removing the object (again represented by self in the class definition) from its parent’s list of children. Also, unless we have a specific handle on the object, all notion of it will be lost and it will eventually be removed from memory when Python performs garbage collection.
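To see the two ends of the link staying in step, here is a short usage sketch. It repeats condensed versions of the two classes so that it runs on its own, with Structure.delete() looping over a copy of the list; the name 'example' and the pdbId '1ABC' are placeholder values, not real entries:

```python
class Structure:

    def __init__(self, name, conformation=0, pdbId=None):
        if not name:
            raise Exception('name must be set to non-empty string')
        self.name = name
        self.conformation = conformation
        self.pdbId = pdbId
        self.chains = []

    def delete(self):
        for chain in list(self.chains):  # copy: children remove themselves
            chain.delete()

    def getChain(self, code):
        for chain in self.chains:
            if chain.code == code:
                return chain
        return None


class Chain:

    allowedMolTypes = ('protein', 'DNA', 'RNA')

    def __init__(self, structure, code, molType='protein'):
        if not code:
            raise Exception('code must be set to non-empty string')
        if molType not in self.allowedMolTypes:
            raise Exception('molType="%s" must be one of %s' %
                            (molType, self.allowedMolTypes))
        if structure.getChain(code):
            raise Exception('code="%s" already used' % code)
        self.structure = structure
        self.code = code
        self.molType = molType
        structure.chains.append(self)

    def delete(self):
        self.structure.chains.remove(self)


structure = Structure('example', pdbId='1ABC')  # placeholder values
chainA = Chain(structure, 'A')
chainB = Chain(structure, 'B', molType='DNA')

codes = [chain.code for chain in structure.chains]
print(codes)                          # ['A', 'B']
print(chainB.structure is structure)  # True

try:
    Chain(structure, 'A')  # duplicate key within the same parent
    message = None
except Exception as err:
    message = str(err)
print(message)  # code="A" already used

chainA.delete()
remaining = [chain.code for chain in structure.chains]
print(remaining)  # ['B']

structure.delete()
print(structure.chains)  # []
```

Because the duplicate check runs before structure.chains is touched, the failed construction leaves the parent's list unchanged.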

Note that in the above example we have chosen to store the chains of a structure using a list. That means that the structure.getChain() function is not particularly efficient, because it has to loop over a list looking for a matching item. If the chains had been unordered (relative to the structure) then an alternative would have been to have a dictionary, chainDict, where the key is code and the value is the chain. The chains could be obtained from chainDict.values(). If we desire both efficiency and ordered children we could have both the dictionary, chainDict, and the list, chains, although we would have to keep them synchronised:


# Alternative Chain implementation

class Structure:

    def __init__(self, name, conformation=0, pdbId=None):
        # … initial part as before
        self.chainDict = {}
        self.chains = []

    def getChain(self, code):
        return self.chainDict.get(code)

class Chain:

    def __init__(self, structure, code, molType='protein'):
        # … initial part as before
        structure.chainDict[code] = self
        structure.chains.append(self)

    def delete(self):
        del self.structure.chainDict[self.code]
        self.structure.chains.remove(self)

Practically, the inefficiency with lists isn’t an issue here because we expect that a given structure will at most have only a few chains, so we will stick with the simpler list implementation.
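For completeness: in Python 3.7 and later, ordinary dictionaries preserve insertion order, so a single dictionary can in principle serve as both the lookup table and the ordered collection. A minimal sketch under that assumption follows; the class bodies are cut down to the link handling, and the chains property is an addition made for illustration:

```python
class Structure:

    def __init__(self, name):
        if not name:
            raise Exception('name must be set to non-empty string')
        self.name = name
        self.chainDict = {}  # code -> Chain, kept in creation order

    @property
    def chains(self):
        # Ordered list of children, rebuilt from the dictionary on demand
        return list(self.chainDict.values())

    def getChain(self, code):
        return self.chainDict.get(code)

    def delete(self):
        for chain in self.chains:  # self.chains is already a copy
            chain.delete()


class Chain:

    def __init__(self, structure, code, molType='protein'):
        if not code:
            raise Exception('code must be set to non-empty string')
        if structure.getChain(code):
            raise Exception('code="%s" already used' % code)
        self.structure = structure
        self.code = code
        self.molType = molType
        structure.chainDict[code] = self

    def delete(self):
        del self.structure.chainDict[self.code]


structure = Structure('example')
for code in ('B', 'A', 'C'):
    Chain(structure, code)

structure.getChain('A').delete()
orderedCodes = [chain.code for chain in structure.chains]
print(orderedCodes)  # ['B', 'C']
```

With only one container there is nothing to keep synchronised, at the cost of building a new list each time chains is read.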

As a final point on the Chain class, although we have only used a single value as the identifying key, it would also be possible to have a key consisting of two values, e.g. (code, molType), where the key is passed around as a tuple.
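A minimal sketch of such a two-value key, assuming a hypothetical variant in which the chains are held in a dictionary keyed by the (code, molType) tuple. The key property and the cut-down constructors are illustrative choices, not part of the implementation above:

```python
class Structure:

    def __init__(self, name):
        self.name = name
        self.chainDict = {}  # (code, molType) -> Chain

    def getChain(self, key):
        # key is a (code, molType) tuple
        return self.chainDict.get(key)


class Chain:

    def __init__(self, structure, code, molType='protein'):
        key = (code, molType)
        if structure.getChain(key):
            raise Exception('key=%s already used' % (key,))
        self.structure = structure
        self.code = code
        self.molType = molType
        structure.chainDict[key] = self

    @property
    def key(self):
        # The key relative to the parent, as a tuple
        return (self.code, self.molType)

    def delete(self):
        del self.structure.chainDict[self.key]


structure = Structure('example')
proteinA = Chain(structure, 'A', 'protein')
dnaA = Chain(structure, 'A', 'DNA')  # same code, different molType

found = structure.getChain(('A', 'DNA'))
print(found is dnaA)  # True
print(proteinA.key)   # ('A', 'protein')
```

Under this scheme two chains may share a code as long as their molType differs, since uniqueness is enforced on the pair rather than on code alone.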

