Table 27.1. Comparing Python, Cython and C module execution speeds. Speed
test results for various implementations of the self-organising map; testing 100
iterations of a 100 by 100 input matrix. The Python version used during testing was
2.7.3 and the C compiler used was GCC version 4.6.3 with the -O2 option.
Implementation   Average run time (seconds)   Speed-up factor
C                82                           227
C+Python         85                           219
Cython           81                           230
Python+NumPy     658                          28
Python           18616                        1
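The speed-up factors in Table 27.1 appear to be the pure Python run time divided by the run time of each implementation. That definition is an assumption, not something the table states, but the short sketch below shows that it reproduces the tabulated values once rounded to the nearest integer.

```python
# Average run times (seconds) copied from Table 27.1
runTimes = {"C": 82, "C+Python": 85, "Cython": 81,
            "Python+NumPy": 658, "Python": 18616}

baseline = runTimes["Python"]  # pure Python is taken as the reference

# Assumed definition: speed-up = baseline time / implementation time
speedUps = {}
for name, seconds in runTimes.items():
    speedUps[name] = int(round(baseline / float(seconds)))
    print("%-13s %6d s   speed-up %4d" % (name, seconds, speedUps[name]))
```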
The ‘ctypes’ module
Python has a module, called ctypes, which lets Python code interact directly with C data
types and call C functions. This is not necessarily much easier than writing Python/C
wrapper code, as was done above, but it does save having to compile any wrapper code.
You still have to be able to understand the C specification of whatever functionality you
are using. It is also important to understand who (the C world or the Python world) ‘owns’
dynamically allocated memory, otherwise it could potentially cause problems with
memory leaks, or crashes from using memory that has been freed. For complete
documentation and further discussion of the many issues that need to be considered, see
the documentation page for the ctypes module on the Python website,
http://www.python.org. We will illustrate a few examples here using the C runtime library,
because that is available on most computer systems. The functionality we use as an
example is already available in Python, but it nonetheless shows the beginnings of how the
ctypes module works.
The ctypes module has an object, cdll, for loading dynamically linked libraries, and so
the first thing to do is to import that:
from ctypes import cdll
On Windows machines the C runtime library is available directly as an attribute of cdll.
On other platforms, like Linux and OS X, there is instead a function, LoadLibrary(). This
has one argument, which is the file name of the library (including the suffix). The file
name is platform specific, but there is a utility function, find_library(), which allows the
user to find the file name of standard libraries such as the C runtime library. To
determine if we are under a Windows operating system, we can use sys.platform, and if it
starts with “win” we assume we are using Windows.
import sys
if sys.platform[:3] == "win":
    libc = cdll.msvcrt
else:
    from ctypes.util import find_library
    fileName = find_library("c") # "c" for C runtime library
    libc = cdll.LoadLibrary(fileName)
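The loading logic above can be wrapped in a small function so that it can be reused. The sketch below is one possible arrangement: the name loadCRuntime() is hypothetical, and the fallback for the case where find_library() finds nothing (it returns None) assumes a POSIX system, where CDLL(None) gives access to the symbols already loaded into the Python process.

```python
import sys
from ctypes import CDLL, cdll
from ctypes.util import find_library

def loadCRuntime():
    """Return a ctypes handle to the C runtime library (hypothetical helper)."""
    if sys.platform.startswith("win"):
        return cdll.msvcrt

    fileName = find_library("c")  # "c" for C runtime library
    if fileName is None:
        # Assumption: on POSIX systems CDLL(None) exposes the libraries
        # already linked into the running process, including the C library
        return CDLL(None)

    return cdll.LoadLibrary(fileName)

libc = loadCRuntime()

# abs() takes a C int and returns a C int, so no conversion is needed
print(libc.abs(-5))
```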
Once we have the handle to the C runtime library we can call its available functionality.
For example, to call the C time() function we just do:
print("time = %d" % libc.time(None))
This prints the number of seconds since 1 January 1970. Here the argument None to
time() represents the C null pointer.
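As a quick check, the value from the C time() function can be compared with Python's own time.time(). The sketch below also shows the alternative to passing None: handing time() a pointer to a variable that it fills in. It assumes that the C type time_t is the same size as a C long, which holds on typical 64-bit Linux systems but is not guaranteed everywhere.

```python
import sys
import time
from ctypes import byref, c_long, cdll
from ctypes.util import find_library

if sys.platform.startswith("win"):
    libc = cdll.msvcrt
else:
    libc = cdll.LoadLibrary(find_library("c"))

# Assumption: time_t has the same size as a C long on this platform
libc.time.restype = c_long

cSeconds = libc.time(None)     # None is passed as the C null pointer
pySeconds = int(time.time())   # the same quantity from Python itself
print("C: %d, Python: %d" % (cSeconds, pySeconds))

# Instead of a null pointer, time() can be given somewhere to store the result
stored = c_long(0)
libc.time(byref(stored))
print("stored = %d" % stored.value)
```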
The standard C print function, printf(), is also available, and this illustrates how to deal
with C types. Only the following restricted set of Python data types can be passed directly
to C functions: None, integers (and longs, in Python 2) and bytes objects (and strings in
Python 2). Other types need converting. For example, for Python floating point numbers
there are three corresponding C data types: ‘float’, ‘double’ and ‘long double’.
Respectively, these have their own conversion functions: c_float(), c_double() and
c_longdouble(). Accordingly, to print a Python float as a C ‘double’, to three decimal
places, we can do:
from ctypes import c_double
x = 3.14159
libc.printf(b"x = %.3f\n", c_double(x))
In Python 3 the ‘b’ prefix makes the string literal a bytes object; in Python 2 it is not
needed, but it works (for Python 2.6 and 2.7). In C the printf() function returns the number of
characters written, and so the above printf() gives two lines of output when called from the
Python prompt:
x = 3.142
10
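The conversion objects can also be explored without calling any C function, because each one stores its value in the corresponding C representation and exposes it again through the value attribute. The following sketch shows that a Python float survives a round trip through c_double() unchanged, but is rounded to single precision by c_float().

```python
from ctypes import c_double, c_float, c_int

x = 0.1

asDouble = c_double(x)  # Python floats are C doubles, so nothing is lost
asFloat = c_float(x)    # single precision keeps fewer significant digits

print(asDouble.value == x)  # True
print(asFloat.value == x)   # False
print(repr(asFloat.value))  # close to, but not exactly, 0.1

# The wrapped value can be changed in place
n = c_int(42)
n.value = 7
print(n.value)
```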
In addition to the standard types, C also allows user-defined data types. These are just a
list of attributes, and for each attribute a type. The ctypes module has a Python class called
Structure, and by subclassing this you effectively get the Python version of a C data type.
In this class you specify the attribute _fields_ (only one underscore before and after, not
two). This is a list and each element of the list contains a 2-tuple, where the first element
of the tuple is the name of the associated datum, and the second element of the tuple is its
type. The names are your choice, but it is good practice to use the C names, which can be
found from reading the C documentation.
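Before turning to a real C data type, a minimal sketch may help. The Point class below is purely illustrative; it does not correspond to anything in the C runtime library, and its hypothetical C equivalent is given in the comment.

```python
from ctypes import Structure, c_double, sizeof

class Point(Structure):
    # Hypothetical C equivalent:  struct point { double x; double y; };
    _fields_ = [("x", c_double),
                ("y", c_double)]

# Fields can be set positionally, in the order given in _fields_
p = Point(1.5, 2.5)

# Each field is available as an ordinary attribute
p.y = 4.0
print(p.x, p.y)

# The object occupies the same memory as the C structure would
print(sizeof(Point))
```

The order of the entries in _fields_ matters, because it determines the layout in memory and must match the order of the members in the C definition.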
We will illustrate use of Structure with a calendar example. For working with calendar
time there is a C data type, ‘struct tm’, which stores the second, month, hour, day and so
on. You have to read the C documentation to know exactly how it is stored in order to be
able to use it via ctypes. For ‘struct tm’ the data type of all the attributes is the C ‘int’ so
here we use c_int. This leads to:
from ctypes import Structure, c_int
class TimeStruct(Structure):
    _fields_ = [
        ('tm_sec', c_int),   # seconds
        ('tm_min', c_int),   # minutes
        ('tm_hour', c_int),  # hours
        ('tm_mday', c_int),  # day of the month
        ('tm_mon', c_int),   # month
        ('tm_year', c_int),  # year
        ('tm_wday', c_int),  # day of the week
        ('tm_yday', c_int),  # day in the year
        ('tm_isdst', c_int)  # daylight saving time
    ]
In the following example we will fetch the current time in seconds, using the libc.time()
function mentioned above, and then use the function libc.localtime(), which will take an
input time in seconds and convert it into a TimeStruct. Here we need to set the return data
type of localtime(), otherwise Python will interpret it as an integer (the default). This is
done by setting the attribute restype of the function.
from ctypes import POINTER, c_long, byref
libc.localtime.restype = POINTER(TimeStruct)
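The effect of restype is perhaps easier to see with a simpler function. The sketch below uses the C function strchr(), which returns a pointer to the first occurrence of a character in a string; it is chosen here only for illustration. The companion attribute argtypes, which declares the argument types, is also set, although this is optional.

```python
import sys
from ctypes import c_char_p, c_int, cdll
from ctypes.util import find_library

if sys.platform.startswith("win"):
    libc = cdll.msvcrt
else:
    libc = cdll.LoadLibrary(find_library("c"))

# C prototype:  char *strchr(const char *s, int c);
libc.strchr.argtypes = [c_char_p, c_int]

# Without this line the returned pointer would be interpreted as a C int
libc.strchr.restype = c_char_p

# A char* result is converted to a Python bytes object
tail = libc.strchr(b"abcdef", ord("d"))
print(tail)

# A C null pointer result is converted to None
missing = libc.strchr(b"abcdef", ord("x"))
print(missing)
```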
When we fetch the time, one technical detail is that time() returns a Python integer but
we need it to be a c_long, so we convert:
t = libc.time(None)
t = c_long(t)
Then the time in seconds has to be passed into the localtime() function using byref(),
because localtime() expects a pointer to the time value: the argument is passed by
reference rather than by value.
resultPtr = libc.localtime(byref(t))
From reading the C documentation we know what the return data type is: it is what C
calls a ‘pointer’, which in this case is to the TimeStruct object. Here, this can be thought
of as a Python list of length one, where the one and only element is the actual TimeStruct
object. Hence we take index 0:
result = resultPtr[0]
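The list-like behaviour of pointers can be tried out in isolation with pointer(), which creates a pointer object from an existing ctypes value. This small sketch uses a plain c_int rather than a TimeStruct.

```python
from ctypes import POINTER, c_int, pointer

value = c_int(42)
ptr = pointer(value)       # roughly what C writes as &value

# Index 0 gives the object pointed to, converted to a Python integer
print(ptr[0])
print(ptr.contents.value)  # an equivalent, more explicit spelling

# Assigning through the pointer changes the original object
ptr[0] = 7
print(value.value)

# A pointer type called with no argument gives a null pointer, which is false
nullPtr = POINTER(c_int)()
print(bool(nullPtr))
```

Note that the object returned by byref() is lighter weight than a full pointer and can only be used as a function argument; it cannot be indexed in this way.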
Finally, we print out the result using TimeStruct attributes. The year is stored as the
number of years since 1900, so we add 1900 to get the usual convention. The month is
numbered from 0 rather than 1, so we add 1 to it.
print("day = %04d %02d %02d, time = %02d:%02d:%02d" %
      (result.tm_year+1900, result.tm_mon+1, result.tm_mday,
       result.tm_hour, result.tm_min, result.tm_sec))
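For convenience, the steps of the calendar example are gathered below into one script, together with a comparison against Python's own time.localtime(). This is a sketch rather than a definitive version: it assumes that time_t is the same size as a C long (true on typical 64-bit Linux systems), and it builds the _fields_ list from a tuple of names simply to save space.

```python
import sys
import time
from ctypes import POINTER, Structure, byref, c_int, c_long, cdll
from ctypes.util import find_library

class TimeStruct(Structure):
    # The nine standard members of the C 'struct tm', all of type int
    _fields_ = [(name, c_int) for name in
                ("tm_sec", "tm_min", "tm_hour", "tm_mday", "tm_mon",
                 "tm_year", "tm_wday", "tm_yday", "tm_isdst")]

if sys.platform.startswith("win"):
    libc = cdll.msvcrt
else:
    libc = cdll.LoadLibrary(find_library("c"))

# localtime() takes a pointer to the time and returns a pointer to a struct tm
libc.localtime.argtypes = [POINTER(c_long)]  # assumption: time_t is a C long
libc.localtime.restype = POINTER(TimeStruct)

seconds = int(time.time())
t = c_long(seconds)
result = libc.localtime(byref(t))[0]

cDate = (result.tm_year+1900, result.tm_mon+1, result.tm_mday,
         result.tm_hour, result.tm_min, result.tm_sec)

# Python's own conversion of the same number of seconds, for comparison
pyDate = tuple(time.localtime(seconds)[:6])

print("day = %04d %02d %02d, time = %02d:%02d:%02d" % cDate)
print("matches Python: %s" % (cDate == pyDate))
```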
1 There are many books for learning C: for example, Kelley, A., and Pohl, I. (1997). A
Book on C: Programming in C. Addison Wesley.