Figure 24.6. A schematic overview of support vector machines. Feature vectors
representing the training data are separated into different regions by a boundary line,
which in the original feature space can follow a complex path. The support vector machine
finds this boundary by locating the linear hyperplane that best separates the data in a
higher-dimensional space. The decision hyperplane lies in the middle of the widest
margin between the data classes, and this margin is itself determined by the support
vectors: the data items that border the decision zone.
The particular algorithm that our example uses in its learning procedure is called
successive over-relaxation. This is a means of efficiently solving the linear equations that
govern the location of the decision hyperplane between two categories of data. The
objective of this algorithm is to determine which of the feature vectors in the training data are
support vectors, and thus to define the orientation of the hyperplane. We will not discuss the
mathematical detail of this method, or of SVMs in general, here; we will merely give a
flavour of what is happening. Keen and more mathematically inclined
readers can investigate the cited references.
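To give a flavour of the iteration, the following is a minimal sketch of this kind of update for the SVM dual problem. The function name sorSvmTrain, the relaxation factor omega and the trick of folding the bias term into the kernel (so that only simple box constraints remain) are illustrative assumptions, not the chapter's actual code:

import numpy as np

def sorSvmTrain(data, classes, kernelFunc, limit=1.0, omega=1.0,
                maxSteps=500, tol=1e-6):
  """Estimate SVM dual coefficients (alphas) by successive
  over-relaxation; classes are +1/-1, limit is the soft-margin C."""

  classes = np.asarray(classes, float)
  n = len(data)

  # Kernel matrix; the extra +1.0 folds the bias term into the
  # kernel, leaving a dual problem with box constraints only
  K = np.array([[kernelFunc(x, y) + 1.0 for y in data] for x in data])
  Q = np.outer(classes, classes) * K   # Hessian of the dual problem

  alphas = np.zeros(n)
  for step in range(maxSteps):
    maxChange = 0.0
    for i in range(n):
      grad = 1.0 - Q[i].dot(alphas)           # dual gradient for alpha_i
      value = alphas[i] + omega * grad / Q[i, i]
      value = min(max(value, 0.0), limit)     # clip back into [0, C]
      maxChange = max(maxChange, abs(value - alphas[i]))
      alphas[i] = value
    if maxChange < tol:   # updates have stopped moving: converged
      break

  supports = alphas > tol   # support vectors have non-zero alpha
  return alphas, supports

A new feature vector x is then classified by the sign of the kernel-weighted sum over the support vectors, i.e. the sign of the sum of alphas[i] * classes[i] * (kernelFunc(data[i], x) + 1.0). Setting omega between 0 and 2 keeps the over-relaxation convergent; omega = 1 reduces it to a plain coordinate-wise update.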
The support vector machine example given here will learn and predict classifications
between two categories of vector data, encoded internally as +1 and −1 respectively.
There are, of course, often situations where more than two categories need to be
predicted. In these cases multiple support vector machines can be used to make
separate two-way decisions. Imagine that you have three categories of data, A, B and C:
the first support vector machine might distinguish A from everything else (i.e. B and C
combined), and a second SVM would then distinguish between the remaining B and C.
It should be noted, however, that where the categories overlap, the order of the two-way
decisions may matter; in general you would try different combinations and make the most
secure predictions first, as the sketch below illustrates.
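As an illustration of this decomposition, here is a minimal cascade built on the sorSvmTrain sketch above; the function name cascadePredict and the fixed decision order are assumptions for illustration, not part of the chapter's example code:

import numpy as np

def cascadePredict(query, data, labels, order, kernelFunc):
  """Classify query by a cascade of two-way SVM decisions, splitting
  one category from the pooled remainder at each stage."""

  remaining = list(order)          # e.g. ['A', 'B', 'C']
  while len(remaining) > 1:
    target = remaining[0]
    classes = [1.0 if cat == target else -1.0 for cat in labels]
    alphas, supports = sorSvmTrain(data, classes, kernelFunc)

    # Decision value: kernel-weighted sum over the support vectors
    score = sum(alphas[i] * classes[i] * (kernelFunc(data[i], query) + 1.0)
                for i in np.nonzero(supports)[0])
    if score > 0:
      return target

    # Reject target: drop its examples and decide among the rest
    keep = [i for i, cat in enumerate(labels) if cat != target]
    data = [data[i] for i in keep]
    labels = [labels[i] for i in keep]
    remaining = remaining[1:]

  return remaining[0]

Retraining inside the prediction function keeps the sketch short; in practice each two-way SVM would be trained once and reused for all queries, with the cascade ordered so that the most reliable separation is tested first.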
Although we will only be discussing an SVM that can be used for classification into two
discrete categories, there is a closely related method, support vector regression, which may
be used to predict continuous values. Here each training vector is associated with a
numeric value, and the support vectors are used to define a line of best fit through these in
the high-dimensional space. This line yields predictions by interpolation: calculating the
position of a query along the line of known slope gives an estimate of the associated
numeric value.
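The regression variant is not implemented in this chapter, but as a quick illustration of the idea, here is a small example using the third-party scikit-learn library (an assumption that it is installed; its SVR class is not part of this book's code), fitting noisy samples of a curve and interpolating a value:

import numpy as np
from sklearn.svm import SVR

np.random.seed(0)

# Noisy samples of a smooth curve: feature vectors with continuous targets
xData = np.linspace(0.0, 5.0, 40).reshape(-1, 1)
yData = np.sin(xData).ravel() + np.random.normal(0.0, 0.1, 40)

# epsilon sets the tube within which errors are ignored;
# points falling outside the tube become the support vectors
model = SVR(kernel='rbf', C=10.0, epsilon=0.1)
model.fit(xData, yData)

query = np.array([[2.5]])
print(model.predict(query))   # interpolated estimate, close to sin(2.5)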