Representing structures as NumPy arrays
The examples above used standard Python loops and didn’t make full use of the NumPy
array objects that store the atom coordinates. Accordingly, given that each atom position is
described as a single NumPy array (a vector of x, y and z coordinates) each atomic
coordinate can be placed inside another larger array, to form a two-dimensional matrix
which represents the whole structure. To be able to work with whole 2D arrays of
coordinates we must first transfer each atom’s coordinates from the data model into the 2D
array. Once defined, the 2D array data can be manipulated without having to repeatedly
loop through each atom in turn; most operations can work on all atoms at the same time.
Internally there will actually be looping going on, but we do not have to think about that
and can trust that it will occur in an efficient and expected manner.
The following function loops through all of the atoms in a structure and places the
coordinates in a NumPy array object. This function is thus general and can be used any
time we wish to move from the hierarchical molecular data model to a numeric array
system. Firstly, we import the required functions from the numpy module: array which
will convert a Python list or tuple and dot for matrix multiplication.
from numpy import array, dot
def getAtomCoords(structure):
coords = []
atoms = []
for chain in structure.chains:
for residue in chain.residues:
for atom in residue.atoms:
coords.append(atom.coords)
atoms.append(atom)
return atoms, array(coords)
Inside the function we simply loop through the chains, residue and atoms and append
the atom coordinates to a larger list of coordinates. We also collect a list of the atom
objects at the same time, so that we will know which coordinate goes with which atom;
obviously useful if we want to update the atom positions after changing the coordinates.
Finally, we return the list of atoms and the coordinates, taking special notice of the fact
that the coordinates are converted from the normal Python list into a NumPy array object
(which here will be two-dimensional).
We will now use this array-generating function within another function that rotates an
input structure. This does exactly the same job as the rotation function described earlier,
but the innards are much simpler to write because we use one 2D array for all the
coordinates and avoid writing loops. As before, the function takes an input angle and axis
to define a rotation matrix which is converted to an array. The getAtomCoords() function
just described is used to get the array object that represents all of the coordinates.
def rotateStructureNumPy(structure, axis=array([1,0,0]), angle=0):
rMatrix = array(getRotationMatrix(axis, angle))
atoms, coords = getAtomCoords(structure)
coords = dot(coords, rMatrix.T)
for index, atom in enumerate(atoms):
atom.coords = list(coords[index])
The actual rotation operation is exceedingly simple: we just use the NumPy dot()
operation to multiply the coordinate array by the rotation matrix;
9
this can be described as
applying a coordinate transformation. Finally, the updated coordinates are used to update
the coordinates of the atom objects. Note that we loop through the atom objects with the
enumerate() function, to get not only the atom but also an index position. This index is
then used to identify the correct location in the coordinate array, given that the order of the
atom list exactly matches the order of coordinates; the getAtomCoords()function
guarantees this. The position for each atom is updated in one operation by assigning one
coordinate array (containing three numbers) to the three atom attributes, i.e. the array is
unpacked into separate components.
Here we test the function by rotating the previously loaded structure’s coordinates π/2
radians (90°), about the y axis.
10
Because the coordinates are modified internally on the
atom objects, there is no value to catch when the function is called. Note that the numeric
constant pi is not known to Python until we import it from the math module.
from math import pi
rotateStructureNumPy(struc, (0,1,0), pi/2)
The next example function moves on from rotation to do an arbitrary transformation of
the structural coordinates. Such a change is called an affine transformation in maths-
speak, and involves a transformation matrix, which may or may not be a rotation, and a
translation: a vector to specify movement from one location to another. The transformation
matrix may be used to stretch, reflect, shrink, enlarge and rotate the structure.
Firstly, we import the identity function from the NumPy module, which is used to
create an identity matrix, where the diagonal elements are one and the other elements are
zero. The identity matrix represents no transformation at all when used in multiplication,
just as normal multiplication by 1 doesn’t change a number. We use an identity matrix of
size three
11
as the default value of the transform variable, so that unless we state otherwise
no matrix transformation occurs. The default for the translation is a vector of zeros; so no
movement occurs.
from numpy import identity
def affineTransformStructure(structure, transform=identity(3),
translate=(0,0,0)):
atoms, coords = getAtomCoords(structure)
coords = dot(coords, transform.T)
coords = coords + translate
for index, atom in enumerate(atoms):
atom.coords = coords[index]
As with the rotation example we use the getAtomCoords() to get the list of atom objects
and the corresponding array of coordinates. The dot() call is used to apply the
transformation, i.e. by matrix multiplication. The translation, to move the coordinates, is
achieved by simply adding the translate vector, containing the numbers to add to the x, y
and z locations, to the coordinate array. Although the coords variable is a two-dimensional
numpy.array, and the translate vector is only one-dimensional (just three values, one for
each axis), the array addition repeats over all of the rows in the larger array. This addition
does a simple element-wise addition for the x, y and z numbers in each atom row;
numbers at equivalent columns in the two arrays are added together to make a new array
of the same size as the coordinate array.
Below, the function is tested on a loaded structure with a transformation that creates a
mirror image (the coordinates take on the opposite sign; negative to positive, positive to
negative) and a translation which is a movement of 10 units along the x and y axes. Using
the basic PDB writer function writeStructureToFile() that is available in the on-line
material we can write the modified structure, with transformed coordinates, to a new file
so that we can view the structure in graphics software to see the result of our handiwork.
Using graphical software to view molecular structures will be discussed later in this
chapter.
mirrorTransform = array([[-1,0,0], [0,-1,0], [0,0,-1]])
translate = array([10.0,10.0,0.0])
affineTransformStructure(struc, mirrorTransform, translate)
from Structures import writeStructureToFile
writeStructureToFile(struc, 'testTransform.pdb')
Do'stlaringiz bilan baham: |