The same issues come up all over again with Residue, a child class of Chain, and Atom, a
child class of Residue. For Residue we will assume that it has an identifying key (relative
to its parent) called seqId, and an optional attribute called code. The Chain class will get
a getResidue() function, which returns the residue with a given seqId (or
None if there isn’t one). We will assume that the Residue children of a Chain are ordered,
according to the order of their creation. However, this time we will use both a dictionary
and a list in the implementation, because a chain can have many (often hundreds of) residues
and we want both an efficient chain.getResidue() look-up and a list of the residues
in sequential order. We will also add another function, getAtoms(), into the Chain class,
which will return all the atoms of all the residues in the chain.
For Atom we will assume that it has a key to identify it, within its parent Residue,
called name, and an additional mandatory attribute coords giving the three-dimensional
(X, Y and Z) coordinates of the atom. It is a design decision to make coordinates
mandatory, and it would be perfectly valid to instead make them optional, so we could
represent ‘no 3D information’. We will assume that the Atom children of a Residue are
ordered, by order of creation. We are going to add a getAtom() function into the Residue
class and so we will again use a dictionary to make this efficient.
This leads to the following proposal for the implementation of Residue and Atom, and a
modified implementation of Chain, noting that there is nothing especially tricky here and
the class construction uses the concepts already described:
class Chain:

    allowedMolTypes = ('protein', 'DNA', 'RNA')

    def __init__(self, structure, code, molType='protein'):
        # … initial part as before
        self.resDict = {}   # Children, keyed by seqId
        self.residues = []  # Children, in order of creation
        structure.chains.append(self)  # Parent's link

    def delete(self):
        # Loop over a copy: residue.delete() removes items from self.residues
        for residue in list(self.residues):
            residue.delete()
        self.structure.chains.remove(self)

    def getResidue(self, seqId):
        return self.resDict.get(seqId)

    def getAtoms(self):
        atoms = []
        for residue in self.residues:
            atoms.extend(residue.atoms)
        return atoms
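The flattening done by getAtoms() can be tried out in isolation. The following is only an illustrative sketch: it uses SimpleNamespace stand-ins rather than the real Residue and Atom classes, and the helper name flattenAtoms is hypothetical. It shows one way to get the same result with chain.from_iterable() from the standard itertools module.

```python
from itertools import chain as iterChain
from types import SimpleNamespace

def flattenAtoms(residues):
    # Same result as looping over residues and calling atoms.extend()
    return list(iterChain.from_iterable(r.atoms for r in residues))

# Stand-in residues, each holding an ordered list of atom names
resA = SimpleNamespace(atoms=['N', 'CA', 'C'])
resB = SimpleNamespace(atoms=['N', 'CA'])

atoms = flattenAtoms([resA, resB])
```

The atoms come back grouped by residue, in residue order, which is why the ordered residues list (rather than the dictionary) is the natural thing to loop over here.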
For the Residue class remember that the unique key to identify it from its parent Chain
is the seqId, so this is what is used in the getResidue() look-up to make sure we don’t have
any repeats. When we construct a Residue it goes in its parent’s chain.resDict, for quick
look-up (with the seqId), and in the chain.residues list, to have the objects in order (an
alternative would be a single ordered dictionary from the collections module; available
from Python 2.7). When a Residue is deleted both of these operations are reversed; we
remove its reference from both the list and dictionary.
class Residue:

    def __init__(self, chain, seqId, code=None):
        if not seqId:
            raise Exception('seqId must be set to non-empty string')
        residue = chain.getResidue(seqId)
        if residue:
            raise Exception('seqId="%s" already used' % seqId)
        self.chain = chain
        self.seqId = seqId
        self.code = code
        self.atomDict = {}  # Children, keyed by name
        self.atoms = []     # Children, in order of creation
        chain.resDict[seqId] = self  # Parent's link
        chain.residues.append(self)  # Parent's link

    def delete(self):
        # Loop over a copy: atom.delete() removes items from self.atoms
        for atom in list(self.atoms):
            atom.delete()
        del self.chain.resDict[self.seqId]
        self.chain.residues.remove(self)

    def getAtom(self, name):
        return self.atomDict.get(name)
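The ordered-dictionary alternative mentioned above could look roughly like this. It is a minimal sketch, not part of the data model: the ResidueStore class and its method names are hypothetical, and plain strings stand in for Residue objects. A single OrderedDict gives both the keyed look-up and the creation order.

```python
from collections import OrderedDict

class ResidueStore:
    """Hypothetical container: one ordered dictionary instead of dict + list."""

    def __init__(self):
        self.resDict = OrderedDict()  # seqId -> residue, in creation order

    def add(self, seqId, residue):
        if seqId in self.resDict:
            raise Exception('seqId="%s" already used' % seqId)
        self.resDict[seqId] = residue

    def remove(self, seqId):
        del self.resDict[seqId]

    def getResidue(self, seqId):
        return self.resDict.get(seqId)

    def getResidues(self):
        return list(self.resDict.values())

store = ResidueStore()
store.add('1', 'ALA')
store.add('2', 'GLY')
store.add('3', 'SER')
store.remove('2')
```

After the removal, getResidues() still gives the remaining residues in the order they were added, and removal by key avoids the linear search that list.remove() performs.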
Lastly for the Atom, it is the same approach again. In this case the key to identify an
atom is its name, so this is used to check for repeats and in the Residue’s dictionary to
look up its children. Because this is the final class in our data model there are no children
of Atom. The atom record naturally holds the important coordinate information, which
defines the three-dimensional structure as coords, a NumPy array containing x, y and z
axis positions.
We are using an array for this to make geometric manipulations easier.
Note that we check that coords is a collection of three items, although we could be more
rigorous and check the data type etc. Also, we have an attribute to state what chemical
element the atom is, which may not be obvious from the name.
from numpy import array

class Atom:

    def __init__(self, residue, name, coords, element):
        if not name:
            raise Exception('name must be set to non-empty string')
        atom = residue.getAtom(name)
        if atom:
            raise Exception('name="%s" already used' % name)
        if len(coords) != 3:
            raise Exception('Coordinates must contain three values')
        self.residue = residue
        self.name = name
        self.coords = array(coords)
        self.element = element
        residue.atomDict[name] = self  # Parent's link
        residue.atoms.append(self)     # Parent's link

    def delete(self):
        del self.residue.atomDict[self.name]
        self.residue.atoms.remove(self)
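As an illustration of the more rigorous coordinate checking mentioned earlier, a stricter test might look something like the sketch below. The helper name checkCoords is hypothetical and is not part of the Atom class; it assumes that anything convertible to three floating point numbers is acceptable. The last lines show the kind of geometric manipulation that arrays make easy.

```python
from numpy import asarray

def checkCoords(coords):
    # Hypothetical helper: convert to a float array, then verify the shape
    try:
        arr = asarray(coords, dtype=float)
    except (TypeError, ValueError):
        raise Exception('Coordinates must be numeric')
    if arr.shape != (3,):
        raise Exception('Coordinates must contain three values')
    return arr

coordsA = checkCoords([1.0, 2.0, 2.0])
coordsB = checkCoords((0, 0, 0))

# Element-wise array arithmetic gives the distance between two positions
delta = coordsA - coordsB
distance = float((delta * delta).sum() ** 0.5)
```

Converting with an explicit float type rejects non-numeric input such as strings of letters, while the shape test rejects both the wrong number of values and nested collections.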
At each level, in the constructor and delete() functions, you need to look upwards to the
parent and downwards to the children.
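One thing to watch when looking downwards in a delete() function: each child's delete() removes that child from the parent's list, so the parent should loop over a copy of the list, not the list itself. The sketch below uses hypothetical Parent and Child classes, which are stand-ins and not part of the data model, to show the difference.

```python
class Parent:

    def __init__(self):
        self.children = []

    def delete(self):
        # Iterate over a copy: each child.delete() shrinks self.children
        for child in list(self.children):
            child.delete()

class Child:

    def __init__(self, parent):
        self.parent = parent
        parent.children.append(self)  # Parent's link

    def delete(self):
        self.parent.children.remove(self)

# Looping over the live list skips children as the list shrinks
parentA = Parent()
for i in range(4):
    Child(parentA)
for child in parentA.children:
    child.delete()
leftover = len(parentA.children)

# Looping over a copy, as Parent.delete() does, removes them all
parentB = Parent()
for i in range(4):
    Child(parentB)
parentB.delete()
remaining = len(parentB.children)
```

With four children, the loop over the live list leaves two of them behind, whereas the loop over the copy leaves none.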