Parallel computations: Convolutional networks pair well with
GPU training, because the matrix-heavy calculations of the convolutional
layers map naturally onto GPUs, which are built to carry out exactly this
kind of matrix arithmetic for graphics processing. In addition, a convolution
is applied to every time step of the input sequence at once, rather than one
step after another as in a recurrent network. Because of this, TCNs can train
much faster than RNNs (see the first sketch after this list).
• Flexibility: TCNs can change the input length, the filter size, the dilation
factors, the number of stacked layers, and so on, which makes them easy to
adapt to a variety of domains; the first sketch after this list exposes these
as parameters.
• Consistent gradients: Because TCNs are composed of convolutional layers,
their backpropagation path differs from an RNN's: gradients flow back through
the depth of the network rather than along the temporal direction of the
sequence, so they remain stable. RNNs suffer from the exploding- and
vanishing-gradient problem, in which the computed gradient is sometimes
extremely large or extremely small, making the resulting weight update either
far too drastic or practically negligible; the second sketch after this list
illustrates this. RNN variants such as the LSTM, GRU, and HF-RNN were
developed to combat this problem.
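
The two sketches below expand on these points. The first is a minimal,
hypothetical example (assuming PyTorch; the class and parameter names are
illustrative, not from any particular library) of a stack of causal, dilated
1-D convolutions: the whole sequence is convolved in a single parallel pass,
and the filter size, dilation factors, and number of layers are all exposed
as parameters.

import torch
import torch.nn as nn


class CausalConvBlock(nn.Module):
    def __init__(self, channels, kernel_size, dilation):
        super().__init__()
        # Left-padding keeps the convolution causal: the output at time t
        # only sees inputs at times <= t.
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.relu = nn.ReLU()

    def forward(self, x):                          # x: (batch, channels, time)
        x = nn.functional.pad(x, (self.pad, 0))    # pad the time axis on the left only
        return self.relu(self.conv(x))


class TinyTCN(nn.Module):
    def __init__(self, channels=16, kernel_size=3, num_layers=4):
        super().__init__()
        # Doubling the dilation at each layer grows the receptive field
        # exponentially with depth.
        self.layers = nn.Sequential(
            *[CausalConvBlock(channels, kernel_size, dilation=2 ** i)
              for i in range(num_layers)]
        )

    def forward(self, x):
        return self.layers(x)


# One forward pass handles all 128 time steps at once, in contrast to an
# RNN's step-by-step recurrence over the sequence.
x = torch.randn(8, 16, 128)    # 8 sequences, 16 channels, 128 time steps
y = TinyTCN()(x)               # output has the same shape: (8, 16, 128)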
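
The second sketch is a toy illustration (a deliberately simplified linear
recurrence, not an actual RNN implementation) of why recurrent gradients
vanish or explode: backpropagating through the sequence multiplies by the
recurrent weight matrix once per time step, so the gradient at the first
step shrinks or grows geometrically with the sequence length.

import torch


def first_step_grad_norm(seq_len, scale):
    # Linear recurrence h_t = W h_{t-1} + x_t (the nonlinearity is omitted so
    # the effect is easy to see). The gradient of the final state with respect
    # to the first input behaves like W raised to the power (seq_len - 1).
    W = scale * torch.eye(4)
    x = torch.randn(seq_len, 4, requires_grad=True)
    h = torch.zeros(4)
    for t in range(seq_len):
        h = W @ h + x[t]
    h.sum().backward()
    return x.grad[0].norm().item()


for length in (10, 50, 100):
    vanish = first_step_grad_norm(length, 0.9)    # eigenvalues < 1: gradient decays
    explode = first_step_grad_norm(length, 1.1)   # eigenvalues > 1: gradient grows
    print(length, vanish, explode)

Roughly speaking, gated architectures such as the LSTM and GRU were introduced
to keep this per-step factor close to one.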