 

How to add recurrent dropout to CuDNNGRU or CuDNNLSTM in Keras

One can apply recurrent dropout onto basic LSTM or GRU layers in Keras by passing its value as a parameter of the layer.

CuDNNLSTM and CuDNNGRU are LSTM and GRU layers that are compatible with CUDA. The main advantage is that they are 10 times faster during training. However they lack some of the beauty of the LSTM or GRU layers in Keras, namely the possibility to pass dropout or recurrent dropout values.
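
For illustration, here is a minimal sketch of the difference (assuming standalone Keras 2.x on a GPU build of TensorFlow, where CuDNNLSTM is available; the 0.2 rates are arbitrary example values, not recommendations):

from keras.layers import LSTM, CuDNNLSTM

# Plain LSTM: dropout (on the inputs) and recurrent_dropout (on the
# recurrent state) are ordinary constructor arguments.
lstm = LSTM(64, dropout=0.2, recurrent_dropout=0.2, return_sequences=True)

# CuDNNLSTM exposes no such arguments; uncommenting the next line raises a
# TypeError complaining that the keyword argument is not understood.
# cudnn = CuDNNLSTM(64, recurrent_dropout=0.2, return_sequences=True)
cudnn = CuDNNLSTM(64, return_sequences=True)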

While we can add Dropout layers directly in the model, it seems we cannot do that with Recurrent Dropout.
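
To make that distinction concrete, here is a sketch of what is possible: ordinary Dropout layers can be stacked between CuDNN layers and drop features from each layer's output, but nothing in this model touches the recurrent, state-to-state connections (the layer sizes, rates, and input shape are placeholder values):

from keras.models import Sequential
from keras.layers import CuDNNGRU, Dropout, Dense

model = Sequential([
    # 100 timesteps of 16 features; sizes are placeholders
    CuDNNGRU(64, return_sequences=True, input_shape=(100, 16)),
    Dropout(0.2),   # regular dropout on the output sequence
    CuDNNGRU(64),
    Dropout(0.2),   # regular dropout on the final hidden state
    Dense(1),
])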

My question is then the following : How to add recurrent dropout to CuDNNGRU or CuDNNLSTM in Keras ?

asked Dec 06 '18 by Xema



2 Answers

I don't think we can have it, as it is not even supported at the low level (i.e. in cuDNN). From François Chollet, creator of Keras:

Recurrent dropout is not implemented in cuDNN RNN ops at the cuDNN level, so we can't have it in Keras.

The dropout option in the cuDNN API is not recurrent dropout (unlike what is in Keras), so it is basically useless (regular dropout doesn't work with RNNs).

Actually using such dropout in a stacked RNN will wreck training.

answered Sep 30 '22 by today


You can use kernel_regularizer and recurrent_regularizer to prevent overfitting. I am using L2 regularizers and getting good results.
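
A minimal sketch of that suggestion (the L2 factor of 1e-4 is an arbitrary example value): CuDNNLSTM and CuDNNGRU do accept kernel_regularizer and recurrent_regularizer, so weight penalties can stand in for dropout as regularization.

from keras.layers import CuDNNLSTM
from keras.regularizers import l2

layer = CuDNNLSTM(
    64,
    kernel_regularizer=l2(1e-4),     # penalizes input-to-hidden weights
    recurrent_regularizer=l2(1e-4),  # penalizes hidden-to-hidden weights
    return_sequences=True,
)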

answered Sep 30 '22 by user3804427