25
Hard problems
Contents
Solving hard problems
The Monte Carlo method
Simple Monte Carlo integration
Function minimisation by Monte Carlo
Metropolis-Hastings Monte Carlo
The travelling salesman problem
Simulated annealing
Simulated annealing of the travelling salesman
Function minimisation by simulated annealing
Particle dynamics simulation
Solving hard problems
This chapter deals with problems that cannot be readily solved with a straightforward,
deterministic algorithm. This includes problems that computer scientists would describe as
NP and not P (non-deterministic in polynomial time, but not solvable in polynomial time),
which is a way of saying that a problem is not efficiently solvable. Whether a problem is
straightforward to solve will depend on the complexity of the system. To take a classic
example, solving the gravitational equations for two orbiting masses, like the Sun and
Earth, is fairly easy, but adding more masses, e.g. the Moon, Mars etc., makes the problem
much harder. The basic equations of the system do not have to be complicated though.
Another famous (NP-hard) problem is the travelling salesman problem. Here the objective
is to find the shortest route on a tour that goes through all the places on the salesman’s list.
The problem is easy to describe, and it is easy to calculate the length of a solution (a
route), but the number of combinations grows very quickly with the number of places to
visit and so finding the best solution can be difficult. This is somewhat different to a
classic optimisation problem, e.g. finding the minimum of a function, where you can
typically follow gradients to home in on the answer.
When it comes to biological information there are many situations of this kind, because
biology frequently deals with large and interacting systems. For example, determining the
structure of a protein generally involves several thousands of atoms and in general we can
only ‘solve’ the structure with good experimental data (e.g. from high-resolution X-ray
crystallography); it is not sufficient to start with unstructured atoms and a physical model.
However, for a complex problem like this, and in a similar vein to measuring a travelling
salesman’s route, testing a given solution to see if it is better or worse can be
proportionately straightforward. Referring again to protein structures, there are many
methods that can quickly calculate the likelihood (or energy) of a structural model. An
obvious approach to working out how a protein folds into a structure would be to approach
the situation in reverse: generate an array of possible solutions and then use the easier
testing methods to see if any of these look reasonable. Unfortunately, the number of
possible combinations is enormous, and there simply wouldn’t be time to exhaustively
search through all possibilities. We can be smarter than this though, for example, taking a
smaller number of random solutions and only testing those, and then basing a subsequent
round of guesses on the best solutions from the previous round. This leads us into the
realm of Monte Carlo and Markov chains, which we describe below. Also, it is often
helpful to use heuristics, which for big problems may provide smart guesses about which
kinds of solution should never be tested, thus saving time. A heuristic for solving a protein
structure might be that the dihedral angles along its backbone have unstrained
conformations.
Do'stlaringiz bilan baham: |