Beginning Anomaly Detection Using Python-Based Deep Learning

Problems with transfer learning

Date: 12.07.2021

Problems with transfer learning: First, let’s define what transfer learning is. Transfer learning is when a model that has been trained for one particular task (classifying vehicles, for example) has its last layer(s) taken out and retrained completely so that the model can be used for a new classification task (classifying animals, for example). In computer vision, there are some really powerful models, such as the Inception-v3 model, that have been trained on powerful GPUs for quite some time in order to achieve the performance that they do. Instead of training our own CNN from the ground up (and most of us don’t have the GPU hardware or the time to spend training an extremely deep model like Inception-v3), we can simply take Inception-v3, for example, which is really good at extracting features out of images, and train it to associate the features that it extracts with a completely new set of classes. This process takes a lot less time since the weights in the layers you keep are already well optimized, so you’re only concerned with finding the optimal weights for the layers you are retraining.

That’s why transfer learning is such a valuable process: it allows us to take a pretrained, high-performance model and simply retrain the last layer(s) on our own hardware to teach the model a new classification task (for CNNs).
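
The idea of keeping the pretrained layers fixed and retraining only the last layer can be sketched with a toy example in plain Python. This is only an illustration under loud assumptions, not the Inception-v3 workflow: the "pretrained" layer is a tiny hand-written feature extractor, the new last layer is trained with the simple perceptron rule, and every name here (FROZEN_WEIGHTS, extract_features, train_new_head, predict) is hypothetical. In practice a deep learning framework and a real pretrained network would take their place.

```python
# Hypothetical "pretrained" layer. Its weights are frozen: nothing below changes them.
FROZEN_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]

def extract_features(x):
    """Frozen feature extractor: a fixed linear layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_WEIGHTS]

def score(features, weights, bias):
    """Output of the new last layer before thresholding."""
    return sum(w * f for w, f in zip(weights, features)) + bias

def train_new_head(samples, labels, epochs=100, lr=1.0):
    """Retrain only the last layer (perceptron rule) on top of frozen features."""
    features = [extract_features(x) for x in samples]
    weights, bias = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            pred = 1 if score(f, weights, bias) > 0 else 0
            # The update is zero whenever the prediction is already correct.
            weights = [w + lr * (y - pred) * fi for w, fi in zip(weights, f)]
            bias += lr * (y - pred)
    return weights, bias

def predict(x, weights, bias):
    return 1 if score(extract_features(x), weights, bias) > 0 else 0

# Toy "new task": the label is 1 when the first input is larger than the second.
samples = [[2, 1], [1, 2], [3, 0], [0, 3], [4, 2], [2, 4]]
labels = [1, 0, 1, 0, 1, 0]

head_weights, head_bias = train_new_head(samples, labels)
print([predict(x, head_weights, head_bias) for x in samples])  # [1, 0, 1, 0, 1, 0]
```

Only head_weights and head_bias are learned here, which is why the retraining step is cheap compared with training every layer from scratch.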

Going back to TCNs, the model might be required to remember varying levels of sequence history in order to make predictions. If the model needed only a certain amount of history to make predictions in the old task, but the new task requires more or less history, that mismatch would cause issues and might lead the model to perform poorly.
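
To make the notion of "history" concrete: the number of past time steps that can influence a single prediction is fixed by the network's architecture. The sketch below assumes a stack of causal one-dimensional convolutions with one convolution per layer, a shared kernel size, and a dilation rate per layer. The function name and the two layer configurations are hypothetical, chosen only to show how an old task and a new task could need different amounts of history.

```python
def receptive_field(kernel_size, dilations):
    """Number of input time steps that can influence one output value of a
    stack of causal 1D convolutions (one convolution per entry in `dilations`)."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# Hypothetical network trained for the old task: three layers.
old_task_history = receptive_field(kernel_size=3, dilations=[1, 2, 4])

# Hypothetical requirement for the new task: five layers' worth of history.
new_task_history = receptive_field(kernel_size=3, dilations=[1, 2, 4, 8, 16])

print(old_task_history, new_task_history)  # 15 63
```

Under these assumptions, a model built to look back 15 steps cannot see the 63 steps the second configuration covers, and retraining only the last layer(s) does not change that.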

In a one-dimensional convolutional layer, we still have the parameter k to determine the size of our kernel, or filter. The way the convolutional layer works is pretty similar to the two-dimensional convolutional layer you looked at in an earlier chapter, but we are only dealing with vectors in this case.
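
As a sketch, the operation can be written as a single formula, assuming a stride of 1, no padding, and a filter that is not flipped as it slides over the input (the usual deep learning convention). Only k comes from the text; n (the input length), w (the filter weights), x (the input vector), and y (the output) are notation introduced here for illustration.

```latex
y_i = \sum_{j=0}^{k-1} w_j \, x_{i+j}, \qquad i = 0, 1, \dots, n - k
```

Under these assumptions the output has n - k + 1 entries, one for each position where the filter fits entirely inside the input vector.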

Here’s an example of what the one-dimensional convolutional operation looks like. Assuming an input vector defined as in Figure 7-1 and a filter initialized as in Figure 7-2, the output of the convolutional layer is calculated as shown in Figure 7-3, Figure 7-4, Figure 7-5, and Figure 7-6.

x = [10, 5, 15, 20, 10, 20]

Figure 7-1. A vector x defined with these corresponding values. This is the input vector.

Filter weights = [1, 0.2, 0.1]
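
The calculation for the input vector [10, 5, 15, 20, 10, 20] and the filter weights [1, 0.2, 0.1] can be sketched in a few lines of plain Python. This is a minimal sketch, assuming a stride of 1, no padding, and a filter that slides over the input without being flipped. The function name conv1d is made up for this illustration and does not come from any library.

```python
def conv1d(x, weights):
    """Slide `weights` over `x` (stride 1, no padding) and return the dot
    product at each position."""
    k = len(weights)
    return [
        sum(w * x[i + j] for j, w in enumerate(weights))
        for i in range(len(x) - k + 1)
    ]

x = [10, 5, 15, 20, 10, 20]       # input vector
filter_weights = [1, 0.2, 0.1]    # filter weights

# Round to hide floating-point noise from the 0.2 and 0.1 weights.
output = [round(v, 2) for v in conv1d(x, filter_weights)]
print(output)  # [12.5, 10.0, 20.0, 24.0]
```

With six inputs and a filter of size 3, the filter fits in four positions, which is consistent with the four output figures the text refers to. The first value, for example, is 10(1) + 5(0.2) + 15(0.1) = 12.5.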
