Speed comparison of implementations
It is often difficult to compare the exact speed of implementations because the details of
how something is compiled can make a large performance difference. For example, with
the C implementation, you can get a difference in speed of a factor of four just by
choosing different optimisation options for the compiler. For the data given here we use
the GCC compiler version 4.6.3 with the -O2 flag.
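For the Python-wrapped C version the optimisation flag is passed when the extension module is built. The following is only a minimal sketch of such a build script: the module name somC and the source file somModule.c are placeholder names, not necessarily those used earlier.

from distutils.core import setup, Extension

# Build the C extension with the same optimisation level used for the
# timings; the module and source file names here are illustrative.
module = Extension('somC', sources=['somModule.c'],
                   extra_compile_args=['-O2'])

setup(name='somC', version='1.0', ext_modules=[module])

Running python setup.py build would then compile the C code with -O2, so the wrapped version is tested under the same conditions as the standalone C program.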
It is also important to understand how the running time scales with the size of the problem: in this case, how big the input data arrays are. All the implementations scale in the same basic way. The slight exception is the NumPy implementation, where some of the array calculations are proportionally more efficient for larger arrays, although once the arrays are large enough for this to apply the NumPy scaling is about the same as the others. Here we have to consider the size of the inputs array (parameters ninputs and depth), the size of the spread array (parameters width and height) and the number of steps (parameter steps), in addition to the size of the output self-organising map (parameters rows, cols and depth).
For testing we have taken ninputs to be rows × cols. In many problems the spread array has a fixed size (e.g. 5×5) and the depth is also fixed (e.g. 3), so that really only the three parameters rows, cols and steps need to be varied. Looking at the algorithm we see that the number of operations scales linearly with steps, but with the squares of both rows and cols, because each of the rows × cols input vectors is compared with every cell of the rows × cols map at every step. Typically rows and cols are the same, so doubling each multiplies the number of operations by (approximately) 16. This is not a fast algorithm.
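As a rough illustration of this scaling, the short sketch below estimates the dominant operation count; the helper opEstimate is not part of any of the implementations above, and it simply assumes that ninputs is rows × cols and that every input vector is compared with every map cell at each step.

def opEstimate(rows, cols, steps, depth=3):
    # Dominant cost: at each step every one of the rows*cols input
    # vectors is compared with every one of the rows*cols map cells,
    # over vectors of length depth.
    ninputs = rows * cols
    return steps * ninputs * rows * cols * depth

print(opEstimate(40, 40, 100) // opEstimate(20, 20, 100))   # prints 16

Doubling rows and cols from 20 to 40 increases the estimate sixteen-fold, matching the argument above.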
There are three C-like implementations that can be tested and compared with the Python versions. Firstly, we can test the C implementation with the test code itself written entirely in C (quite possibly how such an implementation would have been used originally). Secondly, we can test the C implementation via the Python wrapper around it that we provided above. Thirdly, we can test the Cython implementation. Not surprisingly, these all come out at around the same speed. Note that all of these tests
depend on how the C code and the Python code are compiled. Naturally, the relative
values are specific to the actual program being tested, so the speed-up factors are only a
rough guide to the general situation.
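The timings themselves can be collected with a simple wall-clock timer, as in the sketch below; the helper name timeRun is illustrative, and the arguments passed through should match whatever signature the implementation being timed expects.

from time import time

def timeRun(func, *args, **kw):
    # Call the function once with the given arguments, report the elapsed
    # wall-clock time and hand back the function's own result.
    start = time()
    result = func(*args, **kw)
    print('%s took %.3f seconds' % (func.__name__, time() - start))
    return result

Each implementation can then be run in turn on identical input, spread and map-size arguments so that the comparison between them is fair.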
The pure Python implementation is unsurprisingly the slowest, by more than two orders
of magnitude compared with the fastest.