Everything I read about applying dropout to RNNs references this paper by Zaremba et al., which says not to apply dropout to the recurrent connections. Neurons should be dropped out randomly before or after LSTM layers, but not on the recurrent connections inside an LSTM layer. OK.
In that widely cited paper, it seems a fresh random 'dropout mask' is sampled at each timestep, rather than generating one 'dropout mask' per batch, reusing it across all the timesteps of the layer being dropped out, and then sampling a new mask for the next batch.
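To make sure I'm describing the distinction clearly, here is a small NumPy sketch of the two schemes I mean (the shapes, keep probability, and variable names are made up by me, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
timesteps, units, keep_prob = 5, 4, 0.8
activations = rng.normal(size=(timesteps, units))  # one layer's activations over time

# Scheme 1: a fresh mask sampled at every timestep.
per_step_masks = rng.binomial(1, keep_prob, size=(timesteps, units))
dropped_per_step = activations * per_step_masks / keep_prob

# Scheme 2: one mask sampled per sequence/batch and reused at every timestep.
shared_mask = rng.binomial(1, keep_prob, size=(1, units))
dropped_shared = activations * shared_mask / keep_prob
```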
Further, and probably what matters more at the moment, how does TensorFlow do it? I've checked the TensorFlow API and searched around for a detailed explanation, but have yet to find one.
After the LSTM you have shape = (None, 10). So you use Dropout the same way you would in any fully connected network.
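For example, assuming tf.keras and made-up input dimensions (20 timesteps, 8 features), this could look like:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20, 8)),     # hypothetical: 20 timesteps, 8 features
    tf.keras.layers.LSTM(10),          # output shape: (None, 10)
    tf.keras.layers.Dropout(0.5),      # dropout on the LSTM output, as in a dense network
    tf.keras.layers.Dense(1),
])
```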
Dropout is a regularization method where input and recurrent connections to LSTM units are probabilistically excluded from activation and weight updates while training a network. This has the effect of reducing overfitting and improving model performance.
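One way to see this distinction in tf.keras is through the LSTM layer's own arguments (the rates below are arbitrary, chosen only for illustration):

```python
import tensorflow as tf

layer = tf.keras.layers.LSTM(
    units=64,
    dropout=0.2,            # applied to the input connections of the LSTM units
    recurrent_dropout=0.2,  # applied to the recurrent (state-to-state) connections
)
```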
The dropout() function is a built-in function of the TensorFlow.js library. It is used to prevent overfitting in a model by randomly setting a fraction rate of the input units to 0 at each update during training.
The Dropout layer randomly sets input units to 0 with a frequency of rate at each step during training time, which helps prevent overfitting. Inputs not set to 0 are scaled up by 1/(1 - rate) such that the sum over all inputs is unchanged.
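A quick sketch of both behaviours, using an arbitrary all-ones tensor and rate:

```python
import tensorflow as tf

x = tf.ones((1, 10))
layer = tf.keras.layers.Dropout(rate=0.5)

# With rate=0.5, surviving entries are scaled by 1 / (1 - 0.5) = 2.
print(layer(x, training=True))   # roughly half the entries are 0, the rest are 2.0
print(layer(x, training=False))  # at inference the layer is a no-op: all entries stay 1.0
```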
You can check the implementation here.
It uses the dropout op on the input to the RNNCell, then on the output, with the keep probabilities you specify.
It seems each sequence you feed in gets a new mask for the input and another for the output; the masks do not change within a sequence.
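Assuming this refers to the TF 1.x rnn_cell DropoutWrapper, a minimal usage sketch (the keep probabilities and cell size here are arbitrary; this API does not exist under plain TF 2.x):

```python
import tensorflow as tf  # TF 1.x assumed

cell = tf.nn.rnn_cell.LSTMCell(num_units=128)
cell = tf.nn.rnn_cell.DropoutWrapper(
    cell,
    input_keep_prob=0.8,   # dropout applied to the cell's input at each step
    output_keep_prob=0.8,  # dropout applied to the cell's output at each step
)
```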