Particle dynamics simulation
We now move on to dynamics simulations that are somewhat analogous to simulated
annealing. The method shares the concept of going to a new solution to the problem based
on a prior one and also a notion of cooling. However, dynamics simulations move away
from using purely random numbers to determine the search trajectory. Rather, we create
what is called a force field that will push and pull the items into more optimal locations.
Naturally, this sort of approach only works if you have some idea of the kind of influences
on the items of data, e.g. a physical model. The advantages of having a simulated
computer model of a physical system are numerous, but as far as optimisation problems
are concerned they can be efficient search methods that are more directed than Monte
Carlo. Also, we may benefit from knowing that the real physical system actually finds a
solution (e.g. a protein folds) so in theory a model stands at least some chance of also
finding a solution. Here we will use a very simple model system, of repulsive atoms
bonded in a chemical structure, to determine an extended conformation for a molecule.
The molecule will not be large and the force field will be very simplistic (definitely not the
proper physics equations), but it will give you the general idea and will be good enough to
give a reasonable picture of the molecule.
The force field used in the Python example will consist of a simple bond model and a
simple repulsive force between atoms. The bonds will pull atoms to a single set distance
(in reality bond lengths vary and we have double bonds and bond orbitals etc.), and will
attract or repel accordingly when atoms are too close or too distant. The repulsive force
will be calculated between non-bonded atoms and will diminish with an inverse power
relation to the distance between atoms; close atoms will repel strongly but distant atoms
will have little effect. The test molecule itself is the common pharmaceutical paracetamol
(also known as acetaminophen). This is complex enough to present a challenge, but not
too unwieldy for this book. A Python dictionary is used to contain the chemical bond
information, i.e. which atoms are bound to which. The bond dictionary for paracetamol is
given below. Here we have used a system where each atom is identified by a unique name.
There is a key in the dictionary for each atom and the values for that atom are the names
of the other atoms with which it shares chemical bonds. An alternative here would be to
use just index numbers, one for each atom, so that we didn’t have to decide upon the
names of atoms (which follow no special rule). However, with names it is perhaps easier
to visualise the setup.
chemBonds = {'H1': ['O1',], 'O1': ['H1', 'C1'], 'C1': ['O1', 'C2', 'C6'],
'C2': ['C1', 'H2', 'C3'], 'H2': ['C2'], 'C3': ['C2', 'H3',
'C4'],
'H3': ['C3'], 'C4': ['C3', 'N7', 'C5'], 'C5': ['C4', 'H5',
'C6'],
'H5': ['C5'], 'C6': ['C5', 'H6', 'C1'], 'H6': ['C6'],
'N7': ['C4', 'H7', 'C8'], 'H7': ['N7'], 'C8': ['N7', 'O8',
'C9'],
'O8': ['C8'], 'C9': ['C8', 'H9a', 'H9b', 'H9c'], 'H9a':
['C9'],
'H9b': ['C9'], 'H9c': ['C9']}
The main dynamics function is written to take a chemical bond dictionary, the number
of dynamics steps and the length of chemical bonds to aim for. It should be noted that the
scale of any measurements is not explicitly stated, but for a good physical model the bond
lengths would be of the order of 10
−10
metres. Firstly, we make some extra NumPy
imports that are required and then define the function name.
from numpy import zeros, sqrt
def chemParticleDynamics(bondDict, numSteps=5000, bondLen=1.0,
timeStep=0.01):
The list of atoms is determined as the keys of the dictionary that contains the bond
information. The number of atoms is the length of this list and the atomic coordinates are
initialised with random values, using a uniform distribution over a reasonable range. Note
that the last argument to uniform, when defining the initial coordinates, defines the
number of rows and columns to yield values for. Here we want a vector of length three (x,
y, z) for each atom. If developing this method further it may be useful to use custom
Python classes to define Atom objects, as illustrated in
Chapter 8
, rather than using lists.
atoms = list(bondDict.keys())
numAtoms = len(atoms)
atomCoords = uniform(-10.0, 10.0, (numAtoms, 3))
The list indices is simply the numbers of all the atoms. This is defined upfront so that
we don’t have to recreate it repeatedly in the search steps. Likewise n is the floating point
version of the number of steps, and is used to generate the temperature factor in the
annealing schedule.
indices = range(numAtoms)
n = float(numSteps)
The main loop goes through the specified number of search steps and for each of these
the effective temperature temp is defined as in other examples.
for step in range(numSteps):
temp = exp(-step/n)
For the given step, with its temperature factor, the dynamics simulation requires that we
go through all atoms in the molecule and calculate the effect of the other atoms according
to the force field. The for loop goes through all the indices for the atoms, so that we can
get hold of the atom’s name (atom) and the coordinates. Note that we can ignore the first
atom in this loop (hence [1:]), which doesn’t need to move; the others will move around it.
The variable velocity is defined initially as a zero vector and will represent the change that
is made to the atom’s position along the x, y and z axes, once we have considered the
forces.
for i in indices[1:]:
atom = atoms[i]
coords = atomCoords[i]
velocity = zeros(3, float)
For the primary atom (at index i) we then need to consider all of the other atoms that it
interacts with, according to the force field. Hence we define a second loop and
corresponding atom index j, skipping the loop if we come across the primary atom.
for j in indices:
if i == j:
continue
Then comparing each of the secondary atoms with the primary (index i with index j)
delta is calculated as the difference vector between their coordinate positions. Then the
sum of this squared is dist2, the distance between atoms squared.
delta = coords – atomCoords[j]
delta2 = delta * delta
dist2 = delta2.sum()
Next we apply the actual force field. Because we are keeping the demonstration simple
we are using only two terms and these are mutually exclusive. Proper molecular force
fields will have many more, complex terms for things like electrostatic charge, van der
Waals force, bond angle etc. Anyhow, here we test whether the primary and secondary
atoms are directly bound. If they are bound the name of one atom will be in the bonded
list, which is extracted from the bondDict using the other atom as a look-up key. If they
are bound a simple ‘force’ value is calculated as the difference between the input bond
length and the distance between atoms.
8
Note that we only calculated the square distance
up to this point and only now calculate the square root when we need to. The force value
could also be scaled separately to other forces, which in general is a way of balancing the
different, competing terms. Also, any scaling dictates how much movement can be gained
in each step, although here we use a general scale factor timeStep to control the step size.
The scale factors for the two force terms in this example just happen to work well when
they are the same (1.0).
bound = bondDict[atoms[j]]
if atom in bound:
force = bondLen - sqrt(dist2)
If the atoms do not share a direct chemical bond we apply the repulsive term, which is
inversely proportional to the distance between atoms raised to the fourth power (the square
distance squared). Using this power law is quite arbitrary, but works nicely here.
else:
force = 1.0 / (dist2*dist2)
With the force variable defined for a bond or repulsion, we do a check to make sure it is
not too great. This is important because a force field can potentially produce some very
large numbers (e.g. repelling atoms that are very close) which would otherwise throw an
atom too far away for practical purposes, and may also lead to numerical Python errors.
The bounded force value is then multiplied by timeStep, temp and delta, and added to the
total for the velocity. The temperature factor is part of the annealing process discussed
previously and the time step determines how much movement can occur for each iteration.
Longer time steps will move things more quickly but can lead to atoms overshooting
optimal positions. The delta represents the vector between the two atoms, and thus applies
the force in the correct direction, i.e. the force is between atoms.
force = min(max(-200.0, force), 200.0)
velocity += delta * force * temp * timeStep
After all the interacting atoms have been considered the velocity variable will represent
the residual push and pull on the primary atom. At an optimised location all the competing
forces will balance out and this will be a zero vector, but otherwise the residual will move
the atom. Accordingly, velocity is added to the coordinates for the atom.
atomCoords[i] += velocity
After all of the dynamics steps the atom coordinates are centred (for ease of inspection),
by subtracting the average position, and then returned from the function.
center = atomCoords.mean(axis=0)
atomCoords = atomCoords-center
return atomCoords
Testing is simply a matter of using the chemical bonding data defined earlier (the
connection topology) with the particle dynamics function. The resulting atom coordinates
may be printed out, or even displayed in a graphical interface. Naturally, it is only the
relative values of the atom coordinates that are important, rather than their exact values. If
all the positions are rotated or translated by the same amounts it is still the same structure.
coords = chemParticleDynamics(chemBonds)
print(coords)
Do'stlaringiz bilan baham: |