Keras - stateful vs stateless LSTMs

I'm having a hard time conceptualizing the difference between stateful and stateless LSTMs in Keras. My understanding is that at the end of each batch, the "state of the network is reset" in the stateless case, whereas for the stateful case, the state of the network is preserved for each batch, and must then be manually reset at the end of each epoch.

My questions are as follows:

1. In the stateless case, how is the network learning if the state isn't preserved in-between batches?
2. When would one use the stateless vs stateful modes of an LSTM?

257

asked Sep 24 '16 21:09

vgoklani

1 Answer

I recommend first learning the concepts of BPTT (backpropagation through time) and mini-batch SGD (stochastic gradient descent); they will give you a better understanding of how an LSTM is trained.

For your questions,

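Q1. In the stateless case, the LSTM updates its parameters on batch1 and then initializes fresh hidden states and cell states (usually all zeros) for batch2. In the stateful case, it uses batch1's final hidden states and cell states as the initial states for batch2. In both cases the parameters (weights) are kept from batch to batch, and the weights are what the network learns; only the hidden and cell states are reset in stateless mode.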
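To make the difference concrete, here is a minimal sketch in plain Python rather than Keras. The one-unit `tanh` cell, the fixed weights `w` and `u`, and the helper names `run_sequence` and `run_batches` are all made up for illustration; no training happens here, the point is only how the state is initialized at each batch boundary.

```python
import math

def run_sequence(xs, h0, w=0.5, u=0.8):
    """Run a toy one-unit recurrent cell over xs, starting from state h0."""
    h = h0
    for x in xs:
        h = math.tanh(w * x + u * h)
    return h

def run_batches(batches, stateful):
    """Return the final state after each batch.

    Each "batch" holds a single sequence to keep the sketch small.
    The weights w and u are the same for every batch in both modes.
    """
    finals = []
    h = 0.0
    for xs in batches:
        if not stateful:
            h = 0.0  # stateless: every batch starts from zeros
        h = run_sequence(xs, h)  # stateful: h carries over from the previous batch
        finals.append(h)
    return finals

batches = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
stateless = run_batches(batches, stateful=False)
stateful = run_batches(batches, stateful=True)
```

In this sketch the two modes agree on the first batch and differ on the second: the stateless run treats `[0.4, 0.5, 0.6]` as a brand-new sequence, while the stateful run ends in the same state as a single pass over all six values.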

Q2. As you can see above, when the sequences in consecutive batches are connected (e.g. successive windows of one stock's prices), stateful mode is the better choice; otherwise (e.g. each sequence is a complete sentence), use stateless mode.
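For the stateful case the batches have to be laid out carefully: a stateful Keras layer uses the final state of sample i in one batch as the initial state of sample i in the next batch, so row i must continue where row i of the previous batch stopped, and the data must not be shuffled. A rough sketch of one such layout in plain Python follows; the helper name `make_stateful_batches` and the integer "prices" are hypothetical stand-ins.

```python
def make_stateful_batches(series, batch_size, steps):
    """Split one long series into batches whose rows continue across batches."""
    stream_len = len(series) // batch_size
    n_batches = stream_len // steps
    # Row i of every batch reads from the i-th contiguous stream of the series.
    streams = [series[i * stream_len:(i + 1) * stream_len] for i in range(batch_size)]
    batches = []
    for k in range(n_batches):
        # Window k of each stream becomes row i of batch k.
        batches.append([s[k * steps:(k + 1) * steps] for s in streams])
    return batches

prices = list(range(24))  # stand-in for one stock's price history
batches = make_stateful_batches(prices, batch_size=2, steps=3)
```

Here `batches[0]` is `[[0, 1, 2], [12, 13, 14]]` and `batches[1]` is `[[3, 4, 5], [15, 16, 17]]`, so each row picks up exactly where the same row left off. For independent sequences such as sentences no such layout is needed, which is one reason stateless mode is simpler there.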

BTW, @vu.pham said that if we use a stateful RNN, then in production the network is forced to deal with infinitely long sequences. This does not seem correct: as you can see in Q1, the LSTM WON'T learn on the whole sequence. It first learns on the sequence in batch1, updates its parameters, and then learns on the sequence in batch2.

122

answered Oct 19 '22 01:10

zodiac

Tags: tensorflow, deep-learning, keras, lstm
