 

How to interpret tf.layers.dropout training arg

The documentation for the training argument of tf.layers.dropout() is not clear to me.

The documentation states:

training: Either a Python boolean, or a TensorFlow boolean scalar tensor
      (e.g. a placeholder). Whether to return the output in training mode
      (apply dropout) or in inference mode (return the input untouched).

My interpretation is that the value of training determines whether dropout is applied. However, it is not clear to me whether True or False applies the dropout (i.e. which value means training mode). Since this is an optional argument, I expected tf.layers.dropout() to apply dropout by default, but the default is False, which intuitively suggests that the default is not training mode (so dropout is not applied).

It appears that in order for tf.layers.dropout() to actually apply dropout, one would need something like:

tf.layers.dropout(input, 0.5, training=(mode == Modes.TRAIN))

This is not very obvious to me from the documentation as training is an optional argument.

Does this look like the correct way to use tf.layers.dropout()? Why is the training flag not automatically tied to Modes.TRAIN by default, to be overridden for other cases? The default of training=False seems very misleading.
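To make the flag's behavior concrete, here is a minimal NumPy sketch of the semantics in question (inverted dropout); the function name, signature, and RNG handling are illustrative stand-ins, not TensorFlow's actual implementation:

```python
import numpy as np

def dropout(x, rate, training=False, seed=0):
    """Sketch of tf.layers.dropout semantics (not the real op)."""
    if not training:
        # Inference mode (the default): input is returned untouched.
        return x
    rng = np.random.default_rng(seed)
    # Zero each unit with probability `rate`, and scale survivors
    # by 1 / (1 - rate) so the expected sum is unchanged.
    keep = rng.random(x.shape) >= rate
    return np.where(keep, x / (1.0 - rate), 0.0)

x = np.ones((4, 8))
print(np.array_equal(dropout(x, 0.5), x))                 # True: default is a no-op
print(np.array_equal(dropout(x, 0.5, training=True), x))  # False: dropout applied
```

This mirrors the questioner's reading: training=False (the default) passes the input through unchanged, and only training=True actually drops units.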

asked Feb 13 '18 by reese0106

People also ask

What does TF keras layers dropout do?

The Dropout layer randomly sets input units to 0 with a frequency of rate at each step during training time, which helps prevent overfitting. Inputs not set to 0 are scaled up by 1/(1 - rate) such that the sum over all inputs is unchanged.

Are dropout layers trainable?

The trainable attribute does not affect the layer's behavior, as Dropout has no variables/weights that could be frozen during training. The rate argument is a float between 0 and 1: the fraction of the input units to drop.

Does dropout layer have parameters?

Yes, they have the same functionality: dropout as a parameter is applied before that layer's linear transformation (multiplication by weights and addition of bias), while dropout as a layer can also be placed before an activation layer.

What is the use of dropout in Tensorflow?

Dropout consists of randomly setting a fraction rate of the input units to 0 at each update during training, which helps prevent overfitting. The units that are kept are scaled by 1 / (1 - rate), so that their expected sum is unchanged between training time and inference time.
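The scaling claim can be checked numerically: at rate = 0.5, surviving units are multiplied by 1 / (1 - 0.5) = 2, so the dropped-out sum averaged over many random masks matches the undropped sum (a NumPy sketch of inverted dropout, not TensorFlow's implementation):

```python
import numpy as np

rng = np.random.default_rng(42)
rate = 0.5
x = np.ones(1000)

# Average the dropped-out sum over many masks; it should hover
# around the original sum because survivors are scaled by 1/(1-rate).
sums = []
for _ in range(2000):
    keep = rng.random(x.shape) >= rate
    sums.append(np.where(keep, x / (1.0 - rate), 0.0).sum())

print(x.sum())        # 1000.0
print(np.mean(sums))  # close to 1000
```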


1 Answer

Your interpretation of dropout() and its training argument is correct. However, the automatic Modes.TRAIN check you suggest is impossible: the mode is typically available only as a parameter of an Estimator's model_fn(), and Estimators are a higher-level abstraction that is not required in a TensorFlow model.
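This point can be illustrated with a schematic: the mode is only in scope inside a model_fn, so only there can the training flag be derived. The ModeKeys string constants below mirror tf.estimator.ModeKeys; everything else is a stand-in, not real TensorFlow code:

```python
# Stand-in mirroring tf.estimator.ModeKeys string constants.
class ModeKeys:
    TRAIN = "train"
    EVAL = "eval"
    PREDICT = "infer"

def model_fn(features, mode):
    # Only inside model_fn is `mode` known, so only here can the
    # training flag be computed; a bare layer call has no access to it.
    training = (mode == ModeKeys.TRAIN)
    # ...would then call e.g. tf.layers.dropout(features, 0.5, training=training)
    return training

print(model_fn(None, ModeKeys.TRAIN))    # True
print(model_fn(None, ModeKeys.PREDICT))  # False
```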

As to why TensorFlow designed the API with a False default, we can only speculate. One explanation is that the layers abstraction as a whole was intended to default to inference mode, which would explain the default value of dropout()'s training argument.

answered Oct 04 '22 by sgc