Python Programming for Biology: Bioinformatics and Beyond


Representing structures as NumPy arrays



Download 7,75 Mb.
Pdf ko'rish
bet223/514
Sana30.12.2021
Hajmi7,75 Mb.
#91066
1   ...   219   220   221   222   223   224   225   226   ...   514
Bog'liq
[Tim J. Stevens, Wayne Boucher] Python Programming

Representing structures as NumPy arrays

The examples above used standard Python loops and didn’t make full use of the NumPy

array objects that store the atom coordinates. Accordingly, given that each atom position is

described  as  a  single  NumPy  array  (a  vector  of  x,  y  and  z  coordinates)  each  atomic

coordinate  can  be  placed  inside  another  larger  array,  to  form  a  two-dimensional  matrix

which  represents  the  whole  structure.  To  be  able  to  work  with  whole  2D  arrays  of




coordinates we must first transfer each atom’s coordinates from the data model into the 2D

array.  Once  defined,  the  2D  array  data  can  be  manipulated  without  having  to  repeatedly

loop through each atom in turn; most operations can work on all atoms at the same time.

Internally there will actually be looping going on, but we do not have to think about that

and can trust that it will occur in an efficient and expected manner.

The  following  function  loops  through  all  of  the  atoms  in  a  structure  and  places  the

coordinates  in  a  NumPy  array  object.  This  function  is  thus  general  and  can  be  used  any

time  we  wish  to  move  from  the  hierarchical  molecular  data  model  to  a  numeric  array

system.  Firstly,  we  import  the  required  functions  from  the  numpy  module:  array  which

will convert a Python list or tuple and dot for matrix multiplication.

from numpy import array, dot

def getAtomCoords(structure):

coords = []

atoms = []

for chain in structure.chains:

for residue in chain.residues:

for atom in residue.atoms:

coords.append(atom.coords)

atoms.append(atom)

return atoms, array(coords)

Inside the function we simply loop through the chains, residue and atoms and append

the  atom  coordinates  to  a  larger  list  of  coordinates.  We  also  collect  a  list  of  the  atom

objects  at  the  same  time,  so  that  we  will  know  which  coordinate  goes  with  which  atom;

obviously useful if we want to update the atom positions after changing the coordinates.

Finally,  we  return  the  list  of  atoms  and  the  coordinates,  taking  special  notice  of  the  fact

that the coordinates are converted from the normal Python list into a NumPy array object

(which here will be two-dimensional).

We will now use this array-generating function within another function that rotates an

input structure. This does exactly the same job as the rotation function described earlier,

but  the  innards  are  much  simpler  to  write  because  we  use  one  2D  array  for  all  the

coordinates and avoid writing loops. As before, the function takes an input angle and axis

to define a rotation matrix which is converted to an array. The getAtomCoords() function

just described is used to get the array object that represents all of the coordinates.

def rotateStructureNumPy(structure, axis=array([1,0,0]), angle=0):

rMatrix = array(getRotationMatrix(axis, angle))

atoms, coords = getAtomCoords(structure)

coords = dot(coords, rMatrix.T)

for index, atom in enumerate(atoms):

atom.coords = list(coords[index])



The  actual  rotation  operation  is  exceedingly  simple:  we  just  use  the  NumPy  dot()

operation to multiply the coordinate array by the rotation matrix;

9

this can be described as



applying a coordinate transformation. Finally, the updated coordinates are used to update

the coordinates of the atom objects. Note that we loop through the atom objects with the

enumerate()  function,  to  get  not  only  the  atom  but  also  an  index  position.  This  index  is

then used to identify the correct location in the coordinate array, given that the order of the

atom  list  exactly  matches  the  order  of  coordinates;  the  getAtomCoords()function

guarantees this. The position for each atom is updated in one operation by assigning one

coordinate  array  (containing  three  numbers)  to  the  three  atom  attributes,  i.e.  the  array  is

unpacked into separate components.

Here we test the function by rotating the previously loaded structure’s coordinates π/2

radians  (90°),  about  the  y  axis.

10

 Because  the  coordinates  are  modified  internally  on  the



atom objects, there is no value to catch when the function is called. Note that the numeric

constant pi is not known to Python until we import it from the math module.

from math import pi

rotateStructureNumPy(struc, (0,1,0), pi/2)

The next example function moves on from rotation to do an arbitrary transformation of

the  structural  coordinates.  Such  a  change  is  called  an  affine  transformation  in  maths-

speak,  and  involves  a  transformation  matrix,  which  may  or  may  not  be  a  rotation,  and  a

translation: a vector to specify movement from one location to another. The transformation

matrix may be used to stretch, reflect, shrink, enlarge and rotate the structure.

Firstly,  we  import  the  identity  function  from  the  NumPy  module,  which  is  used  to

create an identity matrix, where the diagonal elements are one and the other elements are

zero. The identity matrix represents no transformation at all when used in multiplication,

just as normal multiplication by 1 doesn’t change a number. We use an identity matrix of

size three

11

as the default value of the transform variable, so that unless we state otherwise



no matrix transformation occurs. The default for the translation is a vector of zeros; so no

movement occurs.

from numpy import identity

def affineTransformStructure(structure, transform=identity(3),

translate=(0,0,0)):

atoms, coords = getAtomCoords(structure)

coords = dot(coords, transform.T)

coords = coords + translate

for index, atom in enumerate(atoms):

atom.coords = coords[index]

As with the rotation example we use the getAtomCoords() to get the list of atom objects

and  the  corresponding  array  of  coordinates.  The  dot()  call  is  used  to  apply  the

transformation, i.e. by  matrix multiplication. The  translation, to move  the coordinates, is



achieved by simply adding the translate vector, containing the numbers to add to the x, y

and z locations, to the coordinate array. Although the coords variable is a two-dimensional

numpy.array,  and  the  translate  vector  is  only  one-dimensional  (just  three  values,  one  for

each axis), the array addition repeats over all of the rows in the larger array. This addition

does  a  simple  element-wise  addition  for  the  x,  y  and  z  numbers  in  each  atom  row;

numbers at equivalent columns in the two arrays are added together to make a new array

of the same size as the coordinate array.

Below, the function is tested on a loaded structure with a transformation that creates a

mirror  image  (the  coordinates  take  on  the  opposite  sign;  negative  to  positive,  positive  to

negative) and a translation which is a movement of 10 units along the x and y axes. Using

the  basic  PDB  writer  function  writeStructureToFile()  that  is  available  in  the  on-line

material we can write the modified structure, with transformed coordinates, to a new file

so that we can view the structure in graphics software to see the result of our handiwork.

Using  graphical  software  to  view  molecular  structures  will  be  discussed  later  in  this

chapter.

mirrorTransform = array([[-1,0,0], [0,-1,0], [0,0,-1]])

translate = array([10.0,10.0,0.0])

affineTransformStructure(struc, mirrorTransform, translate)

from Structures import writeStructureToFile

writeStructureToFile(struc, 'testTransform.pdb')




Download 7,75 Mb.

Do'stlaringiz bilan baham:
1   ...   219   220   221   222   223   224   225   226   ...   514




Ma'lumotlar bazasi mualliflik huquqi bilan himoyalangan ©hozir.org 2024
ma'muriyatiga murojaat qiling

kiriting | ro'yxatdan o'tish
    Bosh sahifa
юртда тантана
Боғда битган
Бугун юртда
Эшитганлар жилманглар
Эшитмадим деманглар
битган бодомлар
Yangiariq tumani
qitish marakazi
Raqamli texnologiyalar
ilishida muhokamadan
tasdiqqa tavsiya
tavsiya etilgan
iqtisodiyot kafedrasi
steiermarkischen landesregierung
asarlaringizni yuboring
o'zingizning asarlaringizni
Iltimos faqat
faqat o'zingizning
steierm rkischen
landesregierung fachabteilung
rkischen landesregierung
hamshira loyihasi
loyihasi mavsum
faolyatining oqibatlari
asosiy adabiyotlar
fakulteti ahborot
ahborot havfsizligi
havfsizligi kafedrasi
fanidan bo’yicha
fakulteti iqtisodiyot
boshqaruv fakulteti
chiqarishda boshqaruv
ishlab chiqarishda
iqtisodiyot fakultet
multiservis tarmoqlari
fanidan asosiy
Uzbek fanidan
mavzulari potok
asosidagi multiservis
'aliyyil a'ziym
billahil 'aliyyil
illaa billahil
quvvata illaa
falah' deganida
Kompyuter savodxonligi
bo’yicha mustaqil
'alal falah'
Hayya 'alal
'alas soloh
Hayya 'alas
mavsum boyicha


yuklab olish