Python Programming for Biology: Bioinformatics and Beyond




A Python neural network

The feed-forward neural network example in Python has been split into two functions: one that makes predictions and one that does the training. It would also be possible to construct this neural network using classes (custom kinds of Python objects), and this may hold certain advantages, like the ability to make adapted subclasses. However, using functions makes it simpler to describe the principles of what is happening.

The first function is called neuralNetPredict, which takes some input data for the first layer of network nodes, applies the first weighted connections and trigger functions to pass the signal to the hidden layer of nodes and then applies the second weights and triggers to generate some output. This is used both during the training of the network, to set up the connection weights, and to make predictions on unseen data. Initially some mathematical functions are imported from the NumPy library, so that we can express the operations concisely as arrays and matrices.

from numpy import array, tanh, zeros, ones, random, sum, append



Then we define the function name and its input arguments: an array of input features (inputVec) and two matrices that represent the connection weights. The matrix weightsIn represents the strength of connection between the input nodes (which include the bias node we describe below) and the hidden nodes. Likewise, weightsOut represents the strengths between the hidden and the output nodes. The weights are represented as matrices so that the rows correspond to a set of nodes in one layer and the columns represent the set of nodes in the other layer, to connect everything in one layer to everything in the other. For example, if the network has four input, five hidden and two output nodes, then weightsIn will be a 4×5 matrix, and weightsOut will be a 5×2 matrix.
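As a minimal sketch (not part of the book's code), the shapes for that example network could be illustrated directly with NumPy:

from numpy import random

# Illustrative shapes only: 4 input nodes (including the bias node
# described below), 5 hidden nodes and 2 output nodes
weightsIn = random.random((4, 5))    # input layer to hidden layer
weightsOut = random.random((5, 2))   # hidden layer to output layer
print(weightsIn.shape, weightsOut.shape)   # (4, 5) (5, 2)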

Inside the function the first step is to define the signalIn vector for the network. This is simply a copy of the input features array with an extra value of 1.0 appended to the end. This extra, fixed input is what is known as a bias node, and is present so the baseline (the level without meaningful signal) of an input can be adjusted. This gives more flexibility in the trigger function used for the hidden layer of nodes, which improves learning. The weight matrices must be of the right size to account for the bias node, and although weights from the bias node are still adjusted by training they are naturally not affected by the input data. A bias connection going to each hidden node enables the input to that node to be offset, effectively shifting the centre of the trigger function about so that it can better distinguish the input values; the upshot of this is that the programmer doesn't have to worry about centring input feature values (e.g. making their mean values zero).

def neuralNetPredict(inputVec, weightsIn, weightsOut):

  signalIn = append(inputVec, 1.0)  # input layer

  prod = signalIn * weightsIn.T
  sums = sum(prod, axis=1)
  signalHid = tanh(sums)  # hidden layer

  prod = signalHid * weightsOut.T
  sums = sum(prod, axis=1)
  signalOut = tanh(sums)  # output layer

  return signalIn, signalHid, signalOut

The main operation of the function involves multiplying the input vector, element by element, with the columns of the first matrix of weights. As a result of the training process we describe later, the weight matrix is arranged so that there is a column for each of the hidden nodes. Given that we want to apply the input signal to each hidden node, we use the transpose (.T) of the weight matrix so that columns are switched with rows for the multiplication. This is a requirement because element multiplication of a one-dimensional NumPy array with a two-dimensional array is done on a per-row basis. Next we sum the weighted input for each hidden node (axis=1 of the transposed array, i.e. down each column of the original weight matrix), so we get one value for each hidden node. Then to get the signal that comes from the hidden layer we calculate the hyperbolic tangent of the sums, applying the sigmoid-shaped trigger function to each. This whole operation is then repeated in the same manner for going from the hidden layer to the output layer; we apply weights to the signal vector, sum over columns and apply the trigger function. The final output vector is the prediction from the network. At the end of the function we return all the signal vectors; although only the output values are useful in making predictions, the other vectors are used in training the network.
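To see why the transpose and the axis=1 summation give one value per hidden node, the following small check (not part of the book's code; the numbers are made up) confirms that the operation is equivalent to a standard matrix-vector product with numpy.dot():

from numpy import array, allclose, dot, sum

# Made-up example: 3 input values (including the bias) and 2 hidden nodes
signalIn = array([0.5, -0.2, 1.0])
weightsIn = array([[ 0.1,  0.4],
                   [-0.3,  0.2],
                   [ 0.5, -0.1]])
sums = sum(signalIn * weightsIn.T, axis=1)   # as done in neuralNetPredict
print(allclose(sums, dot(signalIn, weightsIn)))   # True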

The second Python function for the feed-forward neural network is a function to train it by the back-propagation method, to find an optimal pair of weight matrices. The objective is to minimise the error between the output vectors predicted by the network and the target values (known because this is training data). Here the error is calculated as the sum of the squared differences, but other methods may be more appropriate in certain situations. The function is defined and takes the training data as an argument, which is expected to be an array containing pairs of items: an input feature vector and the known output vector. The next argument is the number of nodes in the hidden layer; the sizes of the input and output layers need not be specified because they can be deduced from the lengths of the input and output vectors used in training. The remaining arguments relate to the number of training steps (cycles over the data) that will be made, a value for the learning rate that governs how strongly weights are adjusted and a momentum factor that allows each training cycle to use a fraction of the adjustments that were used in the previous cycle, which makes for smoother training. In practice the learning rate and momentum factor can be optimised, but the default values are generally a fair start.

def neuralNetTrain(trainData, numHid, steps=100, rate=0.5, momentum=0.2):

Within the function a few values are initialised. The numbers of nodes in the input and output layers are extracted from the size of the first item (index zero) of the training data, noting that the number of inputs is then increased by one to accommodate the bias node. The error value which we aim to minimise starts as None, but will be filled with numeric values later.

  numInp = len(trainData[0][0])
  numOut = len(trainData[0][1])
  numInp += 1
  minError = None

Next we make the initial signal vectors as arrays of the required sizes (a value comes from each node) with all elements starting out as 1, courtesy of numpy.ones(). The input will be the feature vector we pass in and the output will be the prediction.

  sigInp = ones(numInp)
  sigHid = ones(numHid)
  sigOut = ones(numOut)

The initial weight matrices are constructed with random values between −0.5 and 0.5, with the required number of rows and columns in each. The random.random function makes matrices of random numbers in the range 0.0 to 1.0, but by taking 0.5 away (from every element) we shift this range. This particular range is not a strict requirement, but is a fairly good general strategy; too small and the network can get stuck, but too large and the learning is stifled. The best weight matrices, which are what we are going to pass back from the function at the end of training, start as these initial weights but then improve.

  wInp = random.random((numInp, numHid)) - 0.5
  wOut = random.random((numHid, numOut)) - 0.5
  bestWeightMatrices = (wInp, wOut)

The next initialisation is for the change matrices, which will indicate how much the weight matrices differ from one training cycle to the next. These are important so that there is a degree of memory or momentum in the training; strong corrections to the weights will tend to keep going and help convergence.

  cInp = zeros((numInp, numHid))
  cOut = zeros((numHid, numOut))

The final initialisation is for the training data: pairs of input and output vectors. This is done to convert all of the vectors into the numpy.array data type, thus allowing the training data to be input as lists and/or tuples. We simply loop through the data, extract each pair, convert to arrays and then put the pair back in the list at the appropriate index (x).

  for x, (inputs, knownOut) in enumerate(trainData):
    trainData[x] = (array(inputs), array(knownOut))

With everything initialised, we can then begin the actual network training, so we go through the required number of loops, using xrange() in Python 2 so that a large list doesn't have to be created. Note we don't use a while loop to check for convergence on the error because a neural network is not always guaranteed to converge and sometimes it can stall before convergence. For each step we shuffle the training data, which is often very important for training; without this there is a bias in the way the weights get optimised. After the shuffle, the error starts at zero for the cycle.

  for step in range(steps):  # xrange() in Python 2
    random.shuffle(trainData)  # Important
    error = 0.0

Next we loop through all of the training data, getting the input feature vector and known output for each example. We then use the current values of the weight matrices, with the prediction function described above, to calculate the signal vectors. Initially the output signal vector (the prediction) will be quite different from the known output vector, but this will hopefully improve over time.

    for inputs, knownOut in trainData:
      sigIn, sigHid, sigOut = neuralNetPredict(inputs, wInp, wOut)

Given the neural network signals that come from the current estimates for the weight matrices, we now apply the back-propagation method to try to reduce the error in the prediction. Thus we calculate the difference between the known output vector and the signal output from the neural network. This difference is squared and summed up over all the features (diff is an array) before being added to the total error for this cycle.

      diff = knownOut - sigOut
      error += sum(diff * diff)

Next we work out an adjustment that will be made to the output weights, to hopefully reduce the error. The adjustment is calculated from the gradient of the trigger function. Because this example uses a hyperbolic tangent function, the gradient at the signal value is one minus the signal value squared (differentiate y = tanh(x) and you get 1 − tanh²(x), which equals 1 − y²). The signal gradient multiplied by the signal difference then represents the change in the signal before the trigger function, which can be used to adjust the weight matrices. Note that all these mathematical operations are performed on all the elements of whole arrays at once, courtesy of NumPy.

      gradient = ones(numOut) - (sigOut*sigOut)
      outAdjust = gradient * diff
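As a quick numerical sanity check (not part of the original code), the identity that the derivative of tanh(x) is 1 − tanh²(x) can be confirmed with a small finite-difference comparison:

from numpy import tanh, allclose

x = 0.3                                  # an arbitrary test point
delta = 1e-6
numeric = (tanh(x + delta) - tanh(x - delta)) / (2 * delta)   # numerical derivative
analytic = 1.0 - tanh(x) ** 2                                 # 1 - y*y, as used above
print(allclose(numeric, analytic))       # True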

The same kind of operation is repeated for the hidden layer, to find the adjustment that will be made for the input weight matrix. Again, we calculate a signal difference and a trigger function gradient and multiply them to get an adjustment for what goes into the trigger function. However, this time we can't compare output vectors, so instead we take the array of signal adjustments just calculated and propagate them back through the network. Thus the signal difference for the hidden layer is calculated by taking the signal adjustment for the output layer and passing it through the output weight matrix, i.e. backwards through the last layer.

      diff = sum(outAdjust * wOut, axis=1)
      gradient = ones(numHid) - (sigHid*sigHid)
      hidAdjust = gradient * diff

With the adjustments calculated it then remains to make the changes to the weight matrices, and hopefully get an improvement in the error. The weight change going from hidden to output layers requires that we calculate a change matrix (the same size as the weights), hence we take the vector of adjustments and the vector of hidden signals and combine them; each adjustment (one per output node) is multiplied by each hidden signal (one per hidden node) to give the matrix of weight changes. Note how we use the reshape() function to convert the array of signals, a single row, into a column vector; it is tipped on its side so that the multiplication can be made to generate a matrix with rows and columns.

      # update output
      change = outAdjust * sigHid.reshape(numHid, 1)
      wOut += (rate * change) + (momentum * cOut)
      cOut = change
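For readers who prefer to think in terms of standard matrix operations, the change matrix built with reshape() is simply the outer product of the hidden signals and the output adjustments. The following small check (not from the book; the array sizes are made up) illustrates this:

from numpy import allclose, outer, random

numHid, numOut = 5, 2                         # made-up layer sizes
sigHid = random.random(numHid)
outAdjust = random.random(numOut)
change = outAdjust * sigHid.reshape(numHid, 1)      # as done above
print(change.shape)                           # (5, 2), the same shape as wOut
print(allclose(change, outer(sigHid, outAdjust)))   # True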

In the same manner the changes are made to the input weight matrix.

      # update input
      change = hidAdjust * sigIn.reshape(numInp, 1)
      wInp += (rate * change) + (momentum * cInp)
      cInp = change

Then finally in the training cycle, we see if the minimum error has been improved on. During the first cycle the minimum error is None, so we always fill it with the first real calculated error value in that case. Each time we find a new minimum error we record the best weight matrices (so far) by taking copies of the current versions, using the handy .copy() function of NumPy arrays. Then finally at the end of all of the training cycles, the best weight matrices are returned.

    if (minError is None) or (error < minError):
      minError = error
      bestWeightMatrices = (wInp.copy(), wOut.copy())
      print("Step: %d Error: %f" % (step, error))

  return bestWeightMatrices

We can test the feed-forward neural network by using some test training data. As a very simple example, the first test takes input vectors with a pair of numbers which are either one or zero. The output corresponds to the 'exclusive or' (XOR) logic function: the output is 1 if either of the inputs is 1, but not both. This test data is a list of [input, output] pairs. Note that even though the output is just a single number it is nonetheless represented as a list with a single item.

data = [[[0,0], [0]],
        [[0,1], [1]],
        [[1,0], [1]],
        [[1,1], [0]]]

The number of hidden nodes used here is simply stated as 2, but in practical situations several values will need to be tried, and their performance evaluated. Then we run the training function on the data to estimate the best weight matrices for the neural network.

wMatrixIn, wMatrixOut = neuralNetTrain(data, 2, 1000)

The output weight matrices can then be run on test data for evaluation. At the very least they ought to do a reasonable job at predicting the output signals for the training set, although in practice this evaluation really ought to be done on data that has not been used in the training.

for inputs, knownOut in data:
  sIn, sHid, sOut = neuralNetPredict(array(inputs), wMatrixIn, wMatrixOut)
  print(knownOut, sOut[0])
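Because the output trigger function is a hyperbolic tangent, the predicted signals will only approach the target values of 0 and 1 rather than reach them exactly. As a simple sketch (not from the book), the continuous outputs could be converted to hard 0/1 predictions by applying a threshold, here at 0.5:

for inputs, knownOut in data:
  sIn, sHid, sOut = neuralNetPredict(array(inputs), wMatrixIn, wMatrixOut)
  predicted = 1 if sOut[0] > 0.5 else 0   # simple threshold on the tanh output
  print(inputs, knownOut[0], predicted)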



