You have just seen an example of the process of training in the feedforward backpropagation network,
described in relation to one hidden layer neuron and one input neuron. There were a few vectors that were
shown and used, but perhaps not made easily identifiable. We therefore introduce some notation and describe
Notation
Let us talk about two matrices whose elements are the weights on connections. One matrix refers to the
interface between the input and hidden layers, and the second refers to that between the hidden layer and the
output layer. Since connections exist from each neuron in one layer to every neuron in the next layer, there is
a vector of weights on the connections going out from any one neuron. Putting this vector into a row of the
matrix, we get as many rows as there are neurons from which connections are established.
Let M1 and M2 be these matrices of weights. Then what does M1[i][j] represent? It is the weight on the
connection from the ith input neuron to the jth neuron in the hidden layer. Similarly, M2[i][j] denotes the
weight on the connection from the ith neuron in the hidden layer to the jth output neuron.
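The row and column convention for the two weight matrices can be sketched in C++ as follows. This is only an illustration: the alias Matrix and the helpers make_weights and weight are hypothetical names chosen here, and the layer sizes used with them are arbitrary. Note that the text counts neurons from 1, while C++ indices start at 0.

```cpp
#include <cstddef>
#include <vector>

// Rows index the neuron a connection leaves, columns the neuron it enters.
using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper: a rows x cols weight matrix with every entry set to init.
Matrix make_weights(std::size_t rows, std::size_t cols, double init = 0.0) {
    return Matrix(rows, std::vector<double>(cols, init));
}

// Weight on the connection from source neuron i to target neuron j.
// Indices here are 0-based, unlike the 1-based counting in the text.
double weight(const Matrix& M, std::size_t i, std::size_t j) {
    return M.at(i).at(j);  // at() throws std::out_of_range on a bad index
}
```

With, say, 2 input, 3 hidden, and 1 output neuron, make_weights(2, 3) would play the role of M1 and make_weights(3, 1) the role of M2, so that each row holds the weights going out from one neuron.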
Next, we will use x, y, z for the outputs of neurons in the input layer, hidden layer, and output layer,
respectively, with a subscript attached to denote which neuron in a given layer we are referring to. Let P
denote the desired output pattern, with p_i as the components. Let m be the number of input neurons, so that
according to our notation, (x_1, x_2, …, x_m) will denote the input pattern. If P has, say, r components, the
output layer needs r neurons. Let the number of hidden layer neurons be n. Let β_h be the learning rate
parameter for the hidden layer, and β_o, that for the output layer. Let θ with the appropriate subscript represent
the threshold value or bias for a hidden layer neuron, and τ with an appropriate subscript refer to the
threshold value of an output neuron.
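As a rough aid to keeping these quantities apart, they can be gathered into one C++ structure. The struct name BackpropNotation and all of its field names are hypothetical and are not taken from the book's code; the sketch only records which vector belongs to which layer and how long it is.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container; each field corresponds to one quantity in the text.
struct BackpropNotation {
    std::vector<double> x;                 // input-layer outputs, m entries
    std::vector<double> y;                 // hidden-layer outputs, n entries
    std::vector<double> z;                 // output-layer outputs, r entries
    std::vector<double> P;                 // desired output pattern, r entries
    std::vector<double> hidden_threshold;  // one threshold (bias) per hidden neuron
    std::vector<double> output_threshold;  // one threshold per output neuron
    double rate_hidden;                    // learning rate for the hidden layer
    double rate_output;                    // learning rate for the output layer

    // m, n, r are the input, hidden, and output layer sizes.
    BackpropNotation(std::size_t m, std::size_t n, std::size_t r,
                     double hidden_rate, double output_rate)
        : x(m, 0.0), y(n, 0.0), z(r, 0.0), P(r, 0.0),
          hidden_threshold(n, 0.0), output_threshold(r, 0.0),
          rate_hidden(hidden_rate), rate_output(output_rate) {}
};
```

The desired pattern P and the computed outputs z have the same length r, which is what lets them be compared component by component.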
Let the errors in output at the output layer be denoted by e_j's and those at the hidden layer by t_i's. If we use a Δ
prefix of any parameter, then we are looking at the change in or adjustment to that parameter. Also, the
thresholding function we would use is the sigmoid function, f(x) = 1 / (1 + exp(-x)).
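The sigmoid function translates directly into C++. The second function below goes a step beyond this passage and should be read as an assumption: it supposes that a neuron adds its threshold (bias) to the weighted sum of its inputs before applying the sigmoid. The names sigmoid and layer_output are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sigmoid thresholding function from the text: f(x) = 1 / (1 + exp(-x)).
double sigmoid(double x) {
    return 1.0 / (1.0 + std::exp(-x));
}

// Assumed forward step (not stated in this passage): neuron j of the next
// layer outputs f(sum over i of in[i] * M[i][j], plus bias[j]).
// M must have in.size() rows and bias.size() columns.
std::vector<double> layer_output(const std::vector<double>& in,
                                 const std::vector<std::vector<double>>& M,
                                 const std::vector<double>& bias) {
    std::vector<double> out(bias.size(), 0.0);
    for (std::size_t j = 0; j < out.size(); ++j) {
        double sum = bias[j];
        for (std::size_t i = 0; i < in.size(); ++i) {
            sum += in[i] * M[i][j];  // weight from neuron i to neuron j
        }
        out[j] = sigmoid(sum);
    }
    return out;
}
```

Under that assumption, calling layer_output once with the input-to-hidden weights and once with the hidden-to-output weights would produce the hidden and output layer outputs in turn. The sigmoid keeps every output strictly between 0 and 1, and it equals 0.5 when its argument is 0.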
Copyright © IDG Books Worldwide, Inc.
C++ Neural Networks and Fuzzy Logic:Preface