What exactly does 'tf.contrib.rnn.DropoutWrapper' in TensorFlow do? (three critical questions)

Question

As far as I know, DropoutWrapper is used as follows:

__init__(
cell,
input_keep_prob=1.0,
output_keep_prob=1.0,
state_keep_prob=1.0,
variational_recurrent=False,
input_size=None,
dtype=None,
seed=None
)


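# Build a fresh wrapped cell per layer; [cell] * num_layers would reuse one cell object for every layer.
def make_cell():
    cell = tf.nn.rnn_cell.LSTMCell(state_size, state_is_tuple=True)
    return tf.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=0.5)
cell = tf.nn.rnn_cell.MultiRNNCell([make_cell() for _ in range(num_layers)], state_is_tuple=True)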

The only thing I know is that it is used for dropout while training. Here are my three questions:

1. What are input_keep_prob, output_keep_prob and state_keep_prob, respectively? (I guess they define the dropout probability for each part of the RNN, but where exactly?)
2. Is dropout in this context applied to the RNN not only during training but also during prediction? If so, is there any way to choose whether or not dropout is used during prediction?
3. According to the API documentation on the TensorFlow web page, if variational_recurrent=True, dropout follows the method in the paper Y. Gal, Z. Ghahramani, "A Theoretically Grounded Application of Dropout in Recurrent Neural Networks", https://arxiv.org/abs/1512.05287. I understand this paper roughly. When I train an RNN, I use a batch, not a single time series. In this case, does TensorFlow automatically assign a different dropout mask to each time series in the batch?

omer sagi · Accepted Answer

input_keep_prob is the keep probability (inclusion probability) for the dropout applied to the inputs of the cell. output_keep_prob is the keep probability for the dropout applied to each RNN unit's output. state_keep_prob is the keep probability for the dropout applied to the hidden state that is passed on to the next time step.
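To make the three positions concrete, here is a minimal plain-Python sketch of one step of a toy elementwise RNN. It is not the TensorFlow implementation: the names dropout and toy_rnn_step and the scalar weights w_x and w_h are made up for illustration, and only the order of the three masking points is meant to mirror the description above.

```python
import math
import random

def dropout(values, keep_prob, rng):
    """Inverted dropout: keep each element with probability keep_prob and
    scale the survivors by 1 / keep_prob so the expected value is unchanged."""
    if keep_prob >= 1.0:
        return list(values)  # keep_prob == 1 means no dropout at all
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

def toy_rnn_step(x, h, w_x, w_h, rng,
                 input_keep_prob=1.0, output_keep_prob=1.0, state_keep_prob=1.0):
    """One step of a toy RNN: h_new = tanh(w_x * x + w_h * h), elementwise."""
    x = dropout(x, input_keep_prob, rng)               # 1) mask the step input
    h_new = [math.tanh(w_x * xi + w_h * hi) for xi, hi in zip(x, h)]
    output = dropout(h_new, output_keep_prob, rng)     # 2) mask what the next layer sees
    next_state = dropout(h_new, state_keep_prob, rng)  # 3) mask what the next time step sees
    return output, next_state

rng = random.Random(0)
output, next_state = toy_rnn_step([1.0, 2.0], [0.0, 0.0], 0.5, 0.5, rng,
                                  output_keep_prob=0.5)
```

With only output_keep_prob set below 1, the state carried to the next step is untouched, while each output element is either zeroed or doubled.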
You can initialize each of the above mentioned parameters as follows:

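import tensorflow as tf

# Keep probability defaults to 1.0 (no dropout); feed a smaller value during training.
dropout_placeholder = tf.placeholder_with_default(1.0, shape=())
# n_hidden_rnn (the number of hidden units) is assumed to be defined elsewhere.
cell = tf.nn.rnn_cell.DropoutWrapper(
    tf.nn.rnn_cell.BasicRNNCell(n_hidden_rnn),
    input_keep_prob=dropout_placeholder,
    output_keep_prob=dropout_placeholder,
    state_keep_prob=dropout_placeholder)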

The keep probability defaults to 1 (no dropout) during prediction, and we can feed any other value during training.
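The same default-versus-fed behaviour can be illustrated without TensorFlow. The sketch below uses a hypothetical helper, apply_dropout, whose keep_prob argument defaults to 1.0 in the same spirit as a placeholder with a default value; it is an analogy, not the library's code.

```python
import random

def apply_dropout(activations, keep_prob=1.0, seed=None):
    """Hypothetical helper: keep_prob defaults to 1.0, so callers that do not
    pass it (prediction) get the activations back unchanged."""
    if keep_prob >= 1.0:
        return list(activations)
    rng = random.Random(seed)
    # Keep each value with probability keep_prob and rescale the survivors.
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]

activations = [0.2, -1.0, 0.7, 1.5]
prediction_out = apply_dropout(activations)                       # default: no dropout
training_out = apply_dropout(activations, keep_prob=0.5, seed=0)  # value fed for training
```

Nothing has to be switched off explicitly at prediction time: leaving the keep probability at its default is enough.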

The mask is applied to the units (which corresponds to masking the fitted weights) rather than being sampled separately for each sequence in the batch. As far as I know, the same mask is used for the entire batch.
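A rough plain-Python sketch of the difference between the two masking schemes follows. It assumes, as stated above, that with variational_recurrent=True one mask is shared by the whole batch and reused at every time step. The function names are invented for this example, and batch[b][t] is taken to be the feature vector of sequence b at time step t.

```python
import random

def sample_mask(size, keep_prob, rng):
    """One scaled Bernoulli mask: 1 / keep_prob for kept units, 0 for dropped ones."""
    return [1.0 / keep_prob if rng.random() < keep_prob else 0.0
            for _ in range(size)]

def variational_dropout(batch, keep_prob, rng):
    """Sample one mask and reuse it for every time step and, by assumption,
    for every sequence in the batch."""
    mask = sample_mask(len(batch[0][0]), keep_prob, rng)
    return [[[x * m for x, m in zip(step, mask)] for step in seq]
            for seq in batch]

def per_step_dropout(batch, keep_prob, rng):
    """Standard dropout for comparison: a fresh mask for every vector."""
    return [[[x * m for x, m in zip(step, sample_mask(len(step), keep_prob, rng))]
             for step in seq] for seq in batch]

# 2 sequences, 3 time steps, 4 features, all ones so the masks are visible.
batch = [[[1.0] * 4 for _ in range(3)] for _ in range(2)]
shared = variational_dropout(batch, 0.5, random.Random(0))
fresh = per_step_dropout(batch, 0.5, random.Random(0))
```

In shared, every vector shows the same pattern of zeros, while in fresh the pattern generally changes from one vector to the next.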

Tags:

python-3.x

neural-network

tensorflow

bayesian

rnn

Eric
