If you use a stateful RNN in Keras to process a sequence of length N divided into N parts (each time step processed individually), the backpropagation horizon is limited to the second dimension of the input. That is, if your data has shape (num_sequences, num_time_steps_per_seq, data_dim), then backpropagation is done over a horizon of num_time_steps_per_seq time steps.
Take a look at https://github.com/fchollet/keras/issues/3669 for a discussion of this.
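As a rough sketch (all sizes here are made up for illustration), it is the timesteps dimension of the input you pass to the layer that bounds how far back gradients flow:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense

# Made-up sizes for illustration.
num_sequences, num_time_steps_per_seq, data_dim = 32, 10, 8

x = np.random.random((num_sequences, num_time_steps_per_seq, data_dim))
y = np.random.random((num_sequences, 1))

model = Sequential()
# Gradients flow back through at most num_time_steps_per_seq
# (here 10) steps per weight update.
model.add(SimpleRNN(16, input_shape=(num_time_steps_per_seq, data_dim)))
model.add(Dense(1))
model.compile(optimizer='rmsprop', loss='mse')
model.fit(x, y, epochs=1)
```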
There are a couple of things you need to know about RNNs in Keras. By default, the parameter return_sequences=False in all recurrent layers. This means that by default only the activations of the RNN after processing the entire input sequence are returned as output. If you want the activations at every time step, and want to optimize every time step separately, you need to pass return_sequences=True as a parameter (https://keras.io/layers/recurrent/#recurrent).
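A minimal sketch of the difference (the layer sizes and shapes are my own illustration, not from the original answer):

```python
from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

timesteps, data_dim = 10, 8

# Default return_sequences=False: only the final activation is returned.
# Output shape: (batch_size, 32)
last_only = Sequential()
last_only.add(LSTM(32, input_shape=(timesteps, data_dim)))

# return_sequences=True: the activation at every time step is returned.
# Output shape: (batch_size, timesteps, 32)
per_step = Sequential()
per_step.add(LSTM(32, return_sequences=True,
                  input_shape=(timesteps, data_dim)))
# TimeDistributed applies the Dense layer (and hence the loss)
# at each time step, so every step is optimized separately.
per_step.add(TimeDistributed(Dense(1)))
```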
The next important thing to know is that all a stateful RNN does is remember the last activation. So if you have a large input sequence and break it up into smaller sequences (which I believe you are doing), the activation in the network is retained after processing the first sequence and therefore affects the activations when processing the second sequence. This has nothing to do with how the network is optimized: the network simply minimizes the difference between its output and the targets you give it.
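For example, here is a sketch (the chunk sizes and random targets below are placeholders I made up) of feeding one long sequence to a stateful layer in pieces, then resetting the state once the whole sequence has been seen:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

# Hypothetical setup: one long sequence of 100 steps,
# broken into 10 sub-sequences of 10 steps each.
batch_size, sub_seq_len, data_dim = 1, 10, 8
long_seq = np.random.random((1, 100, data_dim))

model = Sequential()
# stateful=True requires a fixed batch size via batch_input_shape.
model.add(LSTM(32, stateful=True,
               batch_input_shape=(batch_size, sub_seq_len, data_dim)))
model.add(Dense(1))
model.compile(optimizer='rmsprop', loss='mse')

for epoch in range(5):
    for i in range(0, 100, sub_seq_len):
        chunk = long_seq[:, i:i + sub_seq_len, :]
        target = np.random.random((batch_size, 1))  # placeholder targets
        # The last activation from the previous chunk carries over,
        # but gradients still only flow through the current 10 steps.
        model.train_on_batch(chunk, target)
    # Clear the carried-over state before restarting the long sequence.
    model.reset_states()
```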