
Keras LSTM: dropout vs recurrent_dropout

I realize this post is asking a similar question to this.

But I just wanted some clarification, preferably a link to some kind of Keras documentation that says the difference.

In my mind, dropout works between neurons, and recurrent_dropout works on each neuron between timesteps. But I have no grounding for this whatsoever.

The documentation on the Keras website is not helpful at all.

Asked Apr 20 '18 by Oliver Crow


People also ask

Should dropout be used with LSTM?

After the LSTM you have shape = (None, 10). So you use Dropout the same way you would in any fully connected network: it drops a different group of features for each sample.
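
As a minimal sketch (the input shape, layer sizes, and rate here are made up for illustration), that looks like:

    from tensorflow.keras import layers, models

    # Dropout after the LSTM acts on its (None, 10) output exactly as it
    # would in a fully connected network: it zeroes a random subset of the
    # 10 features per sample during training.
    model = models.Sequential([
        layers.LSTM(10, input_shape=(20, 8)),   # output shape: (None, 10)
        layers.Dropout(0.5),
        layers.Dense(1, activation="sigmoid"),
    ])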

What is Recurrent_dropout in LSTM?

Recurrent Dropout is a regularization method for recurrent neural networks. Dropout is applied to the updates to LSTM memory cells (or GRU states), i.e. it drops out the input/update gate in LSTM/GRU.

Why is recurrent dropout used?

Recurrent dropout is used to fight overfitting in the recurrent layers: it regularizes the recurrent connections and thereby improves the generalization of recurrent networks.

Where is the dropout layer in LSTM?

The dropout probability started at 0.5 and linearly decreased to 0.0 after 8 epochs, after which no dropout was used. In [8], dropout was applied to LSTMs at the point where the input comes from the previous layer (this is equivalent to our "Location 2" below).


1 Answer

The Keras LSTM documentation contains a high-level explanation:

dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs.

recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state.

But this corresponds exactly to the answer you refer to:

Regular dropout is applied on the inputs and/or the outputs, meaning the vertical arrows from x_t and to h_t. ...

Recurrent dropout masks (or "drops") the connections between the recurrent units; that would be the horizontal arrows in your picture.
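
In code, both fractions are just constructor arguments of the layer; a minimal sketch (the unit count and rates are made up for illustration):

    from tensorflow.keras import layers

    # dropout masks the inputs x_t (the vertical connections),
    # recurrent_dropout masks the previous state h_{t-1} that is fed back
    # at every timestep (the horizontal connections).
    lstm = layers.LSTM(
        units=32,
        dropout=0.2,
        recurrent_dropout=0.2,
    )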

If you're interested in the details at the formula level, the best way is to inspect the source code: keras/layers/recurrent.py. Look for rec_dp_mask (the recurrent dropout mask) and dp_mask: one is applied to h_tm1 (the previous hidden state), the other to the inputs.
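
For orientation, here is a heavily simplified sketch (not the actual Keras source; it collapses the per-gate masks into one and uses plain NumPy) of where dp_mask and rec_dp_mask enter a single LSTM step:

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    def lstm_step(x_t, h_tm1, c_tm1, W, U, b, dp_mask, rec_dp_mask):
        # dp_mask drops input units (the `dropout` argument);
        # rec_dp_mask drops units of the previous state h_tm1
        # (the `recurrent_dropout` argument). In Keras the masks are
        # sampled once per batch (one per gate) and reused at every step.
        z = np.dot(x_t * dp_mask, W) + np.dot(h_tm1 * rec_dp_mask, U) + b
        i, f, g, o = np.split(z, 4, axis=-1)       # gate pre-activations
        c_t = sigmoid(f) * c_tm1 + sigmoid(i) * np.tanh(g)
        h_t = sigmoid(o) * np.tanh(c_t)
        return h_t, c_t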

Answered Sep 17 '22 by Maxim