Python Programming for Biology: Bioinformatics and Beyond


Calculating root-mean-square deviation



Tim J. Stevens and Wayne Boucher


To help us interpret the coordinate superimposition, and because it is required in
subsequent examples, we define a function that calculates the variation in the
coordinates across the atom positions represented in our arrays. Strictly, the measure
we use is the root-mean-square deviation, or RMSD for short. As the name suggests, the
RMSD value is calculated by taking the differences in coordinate positions, squaring
them, finding the average value and then taking the square root of this. This is
effectively the average distance of coordinate spread, although it should be noted that
it is an average of squares, not of the distances themselves, and so larger deviations
have more influence on the result. (Note that the alignCoords() function can easily be
adapted to calculate RMSD for the overall transformation, but we have deliberately
avoided complicating the function further.)
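
The bias towards larger deviations can be seen in a small sketch. The four distances below are made-up illustrative values, not data from any structure; the block simply compares a plain average of distances with the root-mean-square of the same numbers.

```python
from math import sqrt

# Hypothetical per-atom distances between two superimposed structures:
# three small deviations and one large outlier (illustrative values only)
distances = [1.0, 1.0, 1.0, 5.0]

# Plain average of the distances themselves
meanDistance = sum(distances) / len(distances)

# Root-mean-square: average the squares, then take the square root
rmsd = sqrt(sum(d * d for d in distances) / len(distances))

print(meanDistance, rmsd)
```

Here the mean distance is 2.0, while the RMSD is the square root of 7 (about 2.65), because the single large deviation dominates the sum of squares.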

The function that calculates the RMSDs takes an array of reference coordinates, a list of
the other coordinate arrays to compare with it and a list of weights, so that each atom
position can be biased separately. Inside the function we initialise a list to hold the
RMSD values for each structure, find the total of all the input weights (using the handy,
inbuilt sum() function) and initialise an array of zeros that is the same size as the
coordinate arrays, which will hold the summation of squared positional differences.

from numpy import sqrt, zeros  # NumPy functions used below

def calcRmsds(refCoords, allCoords, weights):

  rmsds = []
  totalWeight = sum(weights)
  totalSquares = zeros(refCoords.shape)

  for coords in allCoords:
    delta = coords-refCoords
    squares = delta * delta
    totalSquares += squares  # Accumulated for the per-atom RMSDs
    sumSquares = weights*squares.sum(axis=1)  # Weighted squared deviation per atom
    rmsds.append( sqrt(sum(sumSquares)/totalWeight) )

  nStruct = len(allCoords)
  atomRmsds = sqrt(totalSquares.sum(axis=1)/nStruct)

  return rmsds, atomRmsds
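
As a rough usage sketch, the function can be tried on a toy data set. The coordinates and weights below are invented for illustration (three atoms, two structures), and the function is repeated from the listing above only so that the block runs on its own, assuming NumPy is installed.

```python
from numpy import array, sqrt, zeros

def calcRmsds(refCoords, allCoords, weights):

  rmsds = []
  totalWeight = sum(weights)
  totalSquares = zeros(refCoords.shape)

  for coords in allCoords:
    delta = coords-refCoords
    squares = delta * delta
    totalSquares += squares
    sumSquares = weights*squares.sum(axis=1)
    rmsds.append( sqrt(sum(sumSquares)/totalWeight) )

  nStruct = len(allCoords)
  atomRmsds = sqrt(totalSquares.sum(axis=1)/nStruct)

  return rmsds, atomRmsds

# Illustrative reference: three atoms with x, y, z coordinates
refCoords = array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

# Two hypothetical structures: one identical to the reference,
# one with every atom shifted by one unit along x
shifted = refCoords + array([1.0, 0.0, 0.0])
allCoords = [refCoords.copy(), shifted]

weights = [1.0, 1.0, 1.0]  # Equal weighting for every atom

rmsds, atomRmsds = calcRmsds(refCoords, allCoords, weights)
print(rmsds)      # One value per structure
print(atomRmsds)  # One value per atom
```

With these inputs the identical structure gives an RMSD of 0.0 and the shifted structure gives 1.0. Each atom deviates by one unit in only one of the two structures, so each per-atom RMSD is the square root of 0.5.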

The bulk of the function involves looping through the list of coordinate arrays and
comparing them to the reference. The operations in this loop all involve whole array
objects, so when we add, subtract, multiply and divide, the operations are applied to all
elements of the arrays at the same time. This is the advantage of using NumPy arrays:
it simplifies the code and avoids having to write more loops. Accordingly, delta is the
array of all coordinate differences, and the elements of this whole array are squared to
give squares. The squared coordinate differences are added to the array of totals for use
later. The squared deviation for each atom is calculated as the sum of the square values
along the spatial axis (i.e. $x^2 + y^2 + z^2$). Here this is done using the
squares.sum(axis=1) operation; thus we get the total for each atom separately and form
another array. This is then multiplied by the weights for the atoms to give sumSquares,
which represents the contribution of each atom to the coordinate ‘deviation’. Lastly,
sumSquares is summed over all atoms to give a single value, which is divided by the total
weight to find the average atomic square deviation for each structure. The square root of
this (hence root-mean-square deviation) is placed in the RMSD list.
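
The steps inside the loop can be followed on a two-atom example. The coordinate differences and the unequal weights below are made-up values chosen so the arithmetic is easy to check by hand; the variable atomSquares is introduced here only to show the intermediate result.

```python
from numpy import array, sqrt

# Hypothetical coordinate differences for two atoms (rows) in x, y, z (columns)
delta = array([[1.0, 2.0, 2.0],   # Atom 0 is displaced by a distance of 3
               [0.0, 3.0, 4.0]])  # Atom 1 is displaced by a distance of 5

# Illustrative weights: atom 1 counts three times as much as atom 0
weights = array([1.0, 3.0])

squares = delta * delta              # Element-by-element squaring
atomSquares = squares.sum(axis=1)    # x2 + y2 + z2 for each atom
sumSquares = weights * atomSquares   # Weighted contribution of each atom

# Weighted mean of the squared deviations, then the square root
rmsd = sqrt(sumSquares.sum() / weights.sum())
print(atomSquares, sumSquares, rmsd)
```

The per-atom squared deviations are 9 and 25, the weighted contributions are 9 and 75, and the RMSD is the square root of 84/4 = 21 (about 4.58), which sits closer to the more heavily weighted atom.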

Once the loop is complete, the RMSD values for the individual atoms are calculated.
Given that the square differences for the atoms were added to totalSquares for all of the
coordinate arrays (all structures), the average of these is then used to calculate each
atom’s RMSD over the whole set of conformations. As before, this is all done with NumPy
operations, to work on whole arrays without loops.
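
For comparison, the per-atom calculation can also be sketched without accumulating totals in a loop, by stacking the structures into one three-dimensional array. This is an alternative formulation for illustration, not the function's own code; it assumes all coordinate arrays have the same shape, and the coordinates are invented. As in the function, the atom weights are not applied to the per-atom values.

```python
from numpy import array, sqrt

# Illustrative reference with two atoms
refCoords = array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0]])

# Two hypothetical structures: the first matches the reference,
# the second moves only atom 1, by two units along y
moved = refCoords + array([[0.0, 0.0, 0.0],
                           [0.0, 2.0, 0.0]])
allCoords = [refCoords.copy(), moved]

stack = array(allCoords)    # Shape is (nStruct, nAtoms, 3)
delta = stack - refCoords   # Reference is broadcast across all structures

# Sum squares over x, y, z (axis 2), average over structures (axis 0)
atomRmsds = sqrt((delta * delta).sum(axis=2).mean(axis=0))
print(atomRmsds)
```

Atom 0 never moves, so its value is 0.0. Atom 1 has squared deviations of 0 and 4 across the two structures, giving a mean of 2 and a per-atom RMSD of the square root of 2.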

