 

Backpropagation through time in stateful RNNs

If I use a stateful RNN in Keras for processing a sequence of length N divided into N parts (each time step is processed individually),

  1. how is backpropagation handled? Does it only affect the last time step, or does it backpropagate through the entire sequence?
  2. If it does not propagate through the entire sequence, is there a way to do this?
Alex asked Feb 07 '23


2 Answers

The backpropagation horizon is limited to the second dimension of the input sequence, i.e. if your data has shape (num_sequences, num_time_steps_per_seq, data_dim), then backprop is done over a time horizon of num_time_steps_per_seq. Take a look at

https://github.com/fchollet/keras/issues/3669
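A small sketch of that shape convention (assuming TensorFlow's Keras; the layer sizes and random data are arbitrary placeholders):

```python
import numpy as np
import tensorflow as tf

num_sequences, num_time_steps_per_seq, data_dim = 8, 10, 4

# Backprop through time is unrolled over the second dimension,
# i.e. over num_time_steps_per_seq steps per gradient update.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(num_time_steps_per_seq, data_dim)),
    tf.keras.layers.SimpleRNN(16),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

x = np.random.rand(num_sequences, num_time_steps_per_seq, data_dim)
y = np.random.rand(num_sequences, 1)
model.fit(x, y, epochs=1, verbose=0)
```

Gradients flow across the num_time_steps_per_seq steps within each sequence, but never between different sequences in the first (batch) dimension.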

Partha Ghosh answered Feb 12 '23


There are a couple of things you need to know about RNNs in Keras. By default, the parameter return_sequences=False in all recurrent layers. This means that by default only the activations of the RNN after processing the entire input sequence are returned as output. If you want the activations at every time step and want to optimize every time step separately, you need to pass return_sequences=True as a parameter (https://keras.io/layers/recurrent/#recurrent).
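For example (a sketch assuming TensorFlow's Keras; the sizes are arbitrary), the two settings differ only in the output shape:

```python
import numpy as np
import tensorflow as tf

x = np.random.rand(2, 5, 3).astype("float32")  # (batch, time_steps, features)

last_only = tf.keras.layers.SimpleRNN(8)  # default: return_sequences=False
per_step = tf.keras.layers.SimpleRNN(8, return_sequences=True)

out_last = last_only(x)  # shape (2, 8): activation after the whole sequence
out_all = per_step(x)    # shape (2, 5, 8): activation at every time step
```

With return_sequences=True you can attach a loss to every time step (e.g. via a TimeDistributed Dense layer) instead of only to the final one.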

The next thing that is important to know is that all a stateful RNN does is remember the last activation. So if you have a large input sequence and break it up into smaller sequences (which I believe you are doing), the activation in the network is retained after processing the first sequence and therefore affects the activations when processing the second sequence. This has nothing to do with how the network is optimized; the network simply minimizes the difference between the output and the targets you give it.
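A minimal sketch of that behaviour (assuming TensorFlow's Keras): one long sequence of 20 steps is split into two chunks of 10, and the stateful layer carries its activation from the first chunk into the second. Stateful layers require a fixed batch size.

```python
import numpy as np
import tensorflow as tf

chunk1 = np.random.rand(1, 10, 3).astype("float32")
chunk2 = np.random.rand(1, 10, 3).astype("float32")

rnn = tf.keras.layers.SimpleRNN(4, stateful=True)
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10, 3), batch_size=1),  # fixed batch size required
    rnn,
])

out1 = model(chunk1)  # leaves the final activation stored in the layer
out2 = model(chunk2)  # starts from that stored activation, not from zeros
rnn.reset_states()    # clear the carried state between independent long sequences
```

Note that the carried state only affects the forward pass; gradients are still computed per 10-step chunk, so this does not extend the backpropagation horizon across chunks.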

Semi answered Feb 12 '23