Figure 27.2. An overview of how a module written in C may be wrapped so it can
be called from Python. To increase the speed at which calculation-intensive Python
programs run we can write fast modules in the compiled language C and encapsulate them
so that they can be called like normal Python functions. To do this the C module must be
written in such a way as to accept input as Python objects and also to send back any return
values as Python objects. This interface can be written by directly accessing the Python
data structures using Python’s own C library or by using a system like Cython, which can
automatically convert between the two systems. Otherwise, once the data is routed to C
data structures a fast C routine can be constructed in the normal way.
More recently there has been less reason to go down the C route. For much of the
numerical work you can use the NumPy and SciPy modules. Of course, the Python
modules they provide are actually also wrappers around C, to make it quick, so the library
authors have simply done the hard work for you. However, it’s possible that you require
the use of some algorithm that is not easily expressed in functionality that NumPy and
SciPy provide. After all these libraries naturally provide mathematical and array
operations that are general. Fortunately, there are various ways to write C-like code in
Python itself, and this provides another way of optimising code for speed. In this chapter
we discuss one of these, called Cython. Cython is actually a separate language, although it
is very similar to Python and unmodified Python code will normally work directly without
any alteration. What Cython offers is a way to mix Python code with some elements of the
C language and then to automatically convert this friendly language into pure C code,
which is then compiled in the normal C manner, usually to make a Python-compatible
module.
There is still at least one good reason why you might want to interface directly to C
code, and that is if you have an existing extensive library that you do not want to have to
rewrite in NumPy or Cython. It is also possible that a C version of the code is just that
much faster that it is worth writing. Unfortunately it is difficult to know for sure which
way is actually best until the C code is written. Note that one of the main problems with
using C code is that it has to be compiled for each type of machine architecture
5
on which
you want it to run. The same is actually true of Python and NumPy, but generally someone
else has already done the compiling for you. This is an advantage of Python code, and
should not be underestimated, especially if you are intending to distribute your code to
other people.
In this chapter we use the self-organising map from
Chapter 24
as an example, which
should be looked at before reading this chapter in detail. In that chapter a solution to the
problem was implemented by using NumPy arrays. Here we will re-implement it in plain
(non-NumPy) Python, C and Cython, to give a comparison between the various
approaches.
Do'stlaringiz bilan baham: |