Improving speed
When it comes to making Python programs faster there are a few general points to
consider before going through specifics. We can think of program execution time in terms
of the number of operations performed and how long individual operations take. Hence, to
improve speed we try to reduce the number of operations and/or perform faster individual
operations. Also, an important aspect of this is how the number of operations scales with
the size of the problem. For example, the time taken could stay roughly constant, increase
linearly or increase in an exponential manner (to name only a few possibilities) as the
number of data items increases. Naturally the scaling depends on the algorithm in
question. So if things seem too slow it is worth questioning whether a particular
algorithmic approach is a smart one, especially if the number of data items is large and the
speed scales poorly with the size of this data. As a simple, albeit somewhat
contrived, example, to add all the integers between a and b inclusive you could do:
s = 0
for x in range(a, b+1):
    s += x
or
s = sum(range(a, b+1))
or
s = (b+1-a) * (a+b) / 2.0
The first two examples involve an operation for each item in the range (whether explicitly
in the loop or implicitly inside sum()), so their execution time is roughly proportional to the length of the range. The
last formula involves a small, fixed number of operations, which is much quicker for large
ranges, although you could argue that the code is not as easy for a novice to understand.
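The scaling difference can be checked directly with the standard timeit module. The sketch below times the three approaches from above; the absolute numbers will vary by machine, and only the relative trend matters:

```python
# Timing the three summation approaches with the standard timeit module.
import timeit

a, b = 1, 1000000

loop_time = timeit.timeit(
    'total = 0\nfor x in range(a, b+1):\n    total += x',
    globals={'a': a, 'b': b}, number=10)
sum_time = timeit.timeit('sum(range(a, b+1))',
                         globals={'a': a, 'b': b}, number=10)
formula_time = timeit.timeit('(b+1-a) * (a+b) / 2.0',
                             globals={'a': a, 'b': b}, number=10)

# The constant-time formula should be far quicker than either loop form.
print(loop_time, sum_time, formula_time)
```
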
Whether analyses of this kind are worthwhile will depend on the context of the operation
within the program as a whole. For example, if you were doing this kind of summation
many times, in a long loop, the overall gains from optimisation would be significant, but if
it is only done occasionally or if you’re only adding a few values the difference is
probably negligible. On the whole we advise opting for clarity if an operation is not a
performance bottleneck. Sometimes it is easy to spot where the bottlenecks are in your
code and you can simply time how long something takes, as illustrated below. However, if
it is not easy, there are a number of profiler programs, such as the inbuilt Python module
cProfile, that can be used to inspect how long the various operations take and thus show
you where to focus attention.
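As a brief illustration of cProfile, the sketch below profiles a hypothetical function (slow_sum is invented here for the example) and prints the entries with the largest cumulative time:

```python
# Profiling with the inbuilt cProfile module to locate bottlenecks.
# slow_sum is a hypothetical example function, not from the text.
import cProfile
import pstats
import io

def slow_sum(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
slow_sum(100000)
profiler.disable()

# Collect the statistics into a string and show the top five entries
# sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumulative').print_stats(5)
print(stream.getvalue())
```
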
Even when keeping the same basic Python algorithm, some surprisingly simple
programmatic changes or rearrangements can yield speed differences. This chapter will
subsequently go through many more specific examples of these, but common things to
look out for are: avoiding unnecessary function calls (which are relatively slow in
Python), taking care to remove unnecessary operations from long loops and making good
use of Python collections. For the examples below we are using usingTimer(), as
described in Chapter 5, as a decorator to wrap the test code in a function that times how
long the operation takes to run. Firstly, we have an example which uses the math module
to take the square root for a range of numbers, the result of which is added to a total:
import math

@usingTimer
def testFunc(x):
    y = 0
    for i in range(x):
        y += math.sqrt(i)
    return y
Testing with a large number, the call takes just over 2 seconds:
testFunc(10000000)
>>> Function call took 2.046932 seconds
>>> 21081849486.439312
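The usingTimer() decorator used here is defined in Chapter 5; for completeness, a minimal sketch of such a timing decorator (the exact Chapter 5 version may differ in its details) would be:

```python
import time

def usingTimer(function):
    # Wrap a function so that each call reports its elapsed wall-clock
    # time before returning the original result.
    def timedFunction(*args, **kwargs):
        start = time.time()
        result = function(*args, **kwargs)
        elapsed = time.time() - start
        print('Function call took %f seconds' % elapsed)
        return result
    return timedFunction
```
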
Now a small modification is made to import sqrt once at the start, rather than accessing it
repeatedly from the math module inside the loop via dot notation:
from math import sqrt

@usingTimer
def testFunc2(x):
    y = 0
    for i in range(x):
        y += sqrt(i)
    return y
Testing reveals that this new version is about a third of a second quicker:
testFunc2(10000000)
>>> Function call took 1.672865 seconds
>>> 21081849486.439312
Rewriting the function to use a generator expression, thus avoiding an explicit Python loop,
is quicker still. Note that in this version we have also avoided constructing large lists. In
Python 2 we would use the iterable xrange() instead of the list-creating range(), but in
Python 3 the range() function already has the lazy behaviour of xrange() in Python 2. And by
using round parentheses for the comprehension-style expression we create a generator object, rather
than an actual large, filled list.
@usingTimer
def testFunc3(x):
    yGen = (sqrt(i) for i in range(x)) # in Python 2 use xrange(x)
    return sum(yGen)
testFunc3(10000000)
>>> Function call took 1.360537 seconds
>>> 21081849486.439312
Another common tactic to make Python run faster is to investigate whether there are
any fast, pre-constructed libraries that you can import (as Python modules) to do the job.
An obvious example would be to use NumPy for mathematical work involving arrays.
Here the imported code is written in a fast compiled language underneath and avoids
having to go through Python loops. An example of the gain from using numpy.array
objects is illustrated below, and the improvement is impressive, over nine times faster than
the original:
from numpy import arange, sqrt

@usingTimer
def testFunc4(x):
    yArray = sqrt(arange(x))
    return yArray.sum()
testFunc4(10000000)
>>> Function call took 0.221356 seconds
>>> 21081849486.439312
If there is a requirement to make a Python program even faster, above and beyond what
we can achieve with changes to coding style, then we might consider parallelisation (using
multiple processors) or writing a module in a fast compiled language like C. We cover these
more advanced topics in Chapter 27.
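As a taste of the parallelisation idea, one standard-library route is to split the work across processes with multiprocessing. The sketch below divides the square-root summation into chunks; it is only an illustration of the principle, not the approach covered in Chapter 27:

```python
# Splitting the square-root summation across worker processes with
# multiprocessing.Pool. The chunk layout here is an illustrative choice.
from math import sqrt
from multiprocessing import Pool

def partialSum(bounds):
    # Sum square roots over a half-open range [start, stop).
    start, stop = bounds
    return sum(sqrt(i) for i in range(start, stop))

if __name__ == '__main__':
    n = 1000000
    numProcs = 4
    step = n // numProcs
    chunks = [(p * step, (p + 1) * step) for p in range(numProcs)]
    chunks[-1] = (chunks[-1][0], n)  # ensure the last chunk reaches n

    with Pool(numProcs) as pool:
        total = sum(pool.map(partialSum, chunks))
    print(total)
```
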