A Kohonen map in Python
The Python function below encodes a simple self-organising map. It makes extensive use
of array functionality imported from NumPy, which in this instance reduces calculation
time for large maps and, perhaps more importantly for this book, improves brevity by
avoiding loops. Although some readers may find the use of vector/array operations
disconcerting to start with, the approach does focus attention on the higher-level
operations and reasoning that underpin the method.
Firstly, we import a couple of modules to give access to array operations and the
exponential function. Then we define the function that takes a list of inputs (feature
vectors defined as numpy.array objects) and something we have called spread, which
represents the strength of connectivity between close nodes on the grid; this is a square
array (a matrix) of weights where the centre represents the amount of pull from the input
data on the best-matching grid node during learning and the off-centre values determine
how much this influence spreads to adjacent grid positions. The size input is the number
of rows and columns in the grid map and steps is simply the number of learning cycles
that will be performed.
import numpy
from math import exp
def selfOrganisingMap(inputs, spread, size, steps=1000):
We extract the number of rows and columns for the grid from the input size and then
determine vecLen, which represents the size of the feature vectors (e.g. three for a colour
containing red, green and blue values). These numbers are then used to make the initial
version of the map grid, which is named somap. This grid naturally has the required
number of rows and columns, but also has a depth axis so that each grid location contains
a number of feature values (a feature vector), hence the input axis sizes for the map are
nRows, nCols, vecLen.
    nRows, nCols = size
    vecLen = len(inputs[0])
    somap = numpy.random.rand(nRows, nCols, vecLen)
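As a quick check of the shapes involved, the following sketch (using arbitrary example dimensions, not the values used later in this chapter) builds an initial map in the same way:

```python
import numpy

inputs = numpy.random.rand(6, 3)  # six length-3 feature vectors
nRows, nCols = 4, 5               # example grid dimensions
vecLen = len(inputs[0])           # 3 features, e.g. red, green, blue

somap = numpy.random.rand(nRows, nCols, vecLen)
print(somap.shape)                # one feature vector per grid position
```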
The next thing to initialise is an array that determines the influence (pull) of an input
data vector on a region of the map grid. The input spread array gives the basic form of
this, but the weighting must be applied to all of the feature values in the map, however
deep the feature vectors are: all the values along each vector are moved closer to the
corresponding values from the best-match input. The spread array is only one value deep,
but the map is vecLen deep, hence we define the influence array, which uses the values
from spread, repeated vecLen times. Building a deep array of this kind upfront means we
do not have to loop through the features to apply the influence weighting. The influence
array is simple to create; by using the ‘*’ operation on a list the spread array is replicated,
and the dstack() function stacks the copies on top of one another along the depth axis. The
infWidth is calculated to find the radius of the influence, as needed later in the calculation.
Then to improve clarity and speed we define the function makeMesh, which is simply a
renaming of the cryptic numpy.ix_ function; this takes lists of row and column indices and
creates a ‘mesh’ where the indices intersect, and hence allows the extraction of a sub-
matrix from a larger matrix.
    influence = numpy.dstack([spread]*vecLen) # One for each feature
    infWidth = (len(spread)-1) // 2
    makeMesh = numpy.ix_ # Ugly
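Both operations can be seen in isolation with a small sketch; the 2×2 spread and the 4×4 matrix below are illustrative values only, not part of the self-organising map code:

```python
import numpy

# Replicating a small 'spread' three deep, as done for the influence array
spread = numpy.array([[0.5, 1.0],
                      [1.0, 0.5]])
influence = numpy.dstack([spread] * 3)
print(influence.shape)  # (2, 2, 3): one copy of spread per feature

# numpy.ix_ builds a mesh where row and column indices intersect
m = numpy.arange(16).reshape(4, 4)
sub = m[numpy.ix_([1, 2], [0, 3])]  # 2x2 sub-matrix of m
print(sub)
```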
With the initialisation done the code now goes through the main learning steps. In
Python 2, the xrange() function is used instead of range() because the number of steps may
be fairly large and range() would build a complete list in memory rather than simply
yielding one step number at a time, which is all we need. In Python 3, range() already
behaves like Python 2’s xrange(). For each step the decay variable is calculated as the
exponential of minus the proportion of the steps completed; this is then used to diminish
the pull of the input data on
the map vectors so that initially they can change a large amount but later stabilise. Then
for each step we begin a loop through the array of input data vectors.
    for s in range(steps): # xrange in Python 2
        decay = exp(-s/float(steps))
        for vector in inputs:
Inside the loop, for each input, we calculate the difference between the current map and the
vector. This is an array operation in compact NumPy notation where the same vector is taken
away from each of the feature vectors across all of the rows and columns of the grid; the
result is an array the same size as the grid, with a difference vector for each grid position.
This difference array is then squared in an element-wise manner (not matrix
multiplication), then the square differences are added up along the depth (feature) axis.
The result dist2 is effectively a distance (squared) for each point on the grid to the current
input, to give a measure of similarity.
            diff = somap-vector
            diff2 = diff*diff
            dist2 = diff2.sum(axis=2)
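To make the broadcasting step concrete, here is a minimal sketch with a tiny 2×2 grid of length-3 vectors; the values are chosen purely for illustration:

```python
import numpy

# A tiny 2x2 grid of length-3 feature vectors
somap = numpy.array([[[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
                     [[0.0, 1.0, 0.0], [1.0, 1.0, 1.0]]])
vector = numpy.array([1.0, 0.0, 0.0])

diff = somap - vector              # vector is broadcast over rows and columns
dist2 = (diff * diff).sum(axis=2)  # squared distance at each grid position

print(dist2)  # position (0,1) matches the vector exactly, so scores 0.0
```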
The argmin() function quickly finds the index in the array representing the minimum
vector distance (most similar grid position); unfortunately, however, the index given back
is the position in the array as if it were flattened into one long list. Thus, we then have to
perform more calculations to determine the row and column of this index; the row is found
by integer division by the number of columns (giving the number of complete rows covered
by the index), and the column is the remainder, i.e. the offset from the start of that row.
            index = dist2.argmin()
            row = index // nCols
            col = index % nCols
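NumPy also provides unravel_index(), which performs the same flat-to-grid conversion in one call; the sketch below (using an example 3×4 grid) checks it against the manual division and remainder. Note the division is by the number of columns, since consecutive flat indices run along a row:

```python
import numpy

nRows, nCols = 3, 4
dist2 = numpy.arange(12.0).reshape(nRows, nCols)
dist2[1, 2] = -1.0           # make one position the clear minimum

index = dist2.argmin()       # flat index, as if the array were one long list
row = index // nCols         # complete rows covered by the index
col = index % nCols          # remainder: offset within the row

# Library equivalent of the two lines above
row2, col2 = numpy.unravel_index(index, (nRows, nCols))
print(row, col)
```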
Given the row and column of the best-matching feature vector in the grid, we determine
lists of rows and columns which cover the area of the map that will be ‘pulled’ towards the
input. These lists are a range of values that goes infWidth positions either side of the best-
matching row or column. Note the second argument for range() has one added to it,
because the second argument is a limit that is not included on the output. The row and
column numbers (x and y) are subject to a modulo ‘%’ operation so that if values from
the range fall off the edge of the grid (less than zero or greater than last position) the
calculated remainder continues the row/column on the opposite side. Effectively this joins
the edges of the grid so that it wraps round in a continuous way; it needn’t be done like
this, but edge effects are helpfully removed.
            rows = [x % nRows for x in range(row-infWidth, row+1+infWidth)]
            cols = [y % nCols for y in range(col-infWidth, col+1+infWidth)]
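The wrap-around behaviour of the modulo operation can be seen in isolation; with an example 10-row grid and an infWidth of 2, a best match at the edge produces indices that continue on the opposite side:

```python
nRows = 10
infWidth = 2

row = 0  # best match at the top edge of the grid
rows = [x % nRows for x in range(row - infWidth, row + 1 + infWidth)]
print(rows)  # -2 and -1 wrap round to 8 and 9
```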
Finally in the loop, the rows and columns are used to make a mesh, where they
intersect, that is then used to extract sub-matrices for the grid map and difference array.
The sub-matrix part of the map is the region that will be influenced by the input vector.
The sub-matrix of the map is moved towards the input feature vector by subtracting the
corresponding sub-matrix of the difference array (reducing the difference between map
and input). Note that the sub-matrix of difference values (diff[mesh]) is scaled by two
things: the influence of the input vector on the grid points near to the best-match position
and the scaling factor that decays during the course of the learning. The influence array is
the same size and shape as the sub-matrix, but decay is a simple number. The scaling
means the amount of adjustment of the map grid towards the input vector diminishes both
with the in-map distance from the best-match point and over the course of repeating the
procedure. Finally, after the loop the grid array somap (of mapped feature vectors) is
passed back.
            mesh = makeMesh(rows,cols)
            somap[mesh] -= diff[mesh] * influence * decay
    return somap
To test the function we first define a numpy.array which will determine how far across
the self-organising map (grid) the input influence spreads. The centre (scoring 1.0 here)
represents the best-match position in the map.
spread = numpy.array([[0.0, 0.10, 0.2, 0.10, 0.0],
                      [0.1, 0.35, 0.5, 0.35, 0.1],
                      [0.2, 0.50, 1.0, 0.50, 0.2],
                      [0.1, 0.35, 0.5, 0.35, 0.1],
                      [0.0, 0.10, 0.2, 0.10, 0.0]])
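A matrix of this shape can also be generated rather than typed in by hand; the sketch below builds a similar weighting from a two-dimensional Gaussian. The function name gaussSpread and the width parameter sigma are illustrative choices, not part of the code above, and the values will not exactly match the hand-written matrix:

```python
import numpy

def gaussSpread(infWidth, sigma=1.5):
    # Squared distance of each cell from the centre of the square grid
    idx = numpy.arange(-infWidth, infWidth + 1)
    x, y = numpy.meshgrid(idx, idx)
    dist2 = x * x + y * y
    # Centre scores 1.0, with influence falling off with distance
    return numpy.exp(-dist2 / (2.0 * sigma * sigma))

spread = gaussSpread(2)
print(spread.shape)  # (5, 5)
```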
As a test we will make some random input: an array of 400 (20×20) vectors of length 3.
Each feature vector will represent a different colour in terms of red, green and blue values
between zero and one.
rows, cols = 20, 20
testInput = numpy.random.rand(rows * cols, 3)
The self-organising map is run for 100 iterations, and will arrange the values on a
20×20 grid. Note that the input is of length rows × cols so that we have exactly one data
point (input colour vector) for each map position. This needn’t be the case, but is good to
use for this example because it allows a 1-to-1 mapping of inputs to grid positions; thus
the operation effectively performs shuffling of the inputs to make a 2D map of colour
similarity.
som = selfOrganisingMap(testInput, spread, (rows, cols), 100)
We can then view the results of the organised map (look at the feature vectors for each
node in the grid) by converting the numpy.array into a real colour image that we can view
on screen. The conversion of an array of numbers to a displayable, graphical image is
discussed in Chapter 18.
from PIL import Image
colors = som*255
colors = colors.astype(numpy.uint8)
img1 = Image.fromarray(colors, 'RGB')
img1.save('som.png', 'PNG')
img1.show()