For tf.layers.dropout(), the documentation for the training argument is not clear to me.
The documentation states:
training: Either a Python boolean, or a TensorFlow boolean scalar tensor
(e.g. a placeholder). Whether to return the output in training mode
(apply dropout) or in inference mode (return the input untouched).
My interpretation is that the dropout will be applied depending on whether training = True or training = False. However, it's not clear to me whether True or False applies the dropout (i.e. which one is training mode). Given that this is an optional argument, I expected that tf.layers.dropout() would apply dropout by default, but the default is False, and intuitively training=False would suggest that the default is not training mode.
It appears that in order for tf.layers.dropout() to actually apply dropout, one would need something like:
tf.layers.dropout(input, 0.5, training=(mode == Modes.TRAIN))
This is not very obvious from the documentation, since training is an optional argument. Does this appear to be the correct way to use tf.layers.dropout? And why is the training flag not simply tied to Modes.TRAIN by default, to be adjusted for other cases as needed? The default of training=False seems very misleading.
The Dropout layer randomly sets input units to 0 with a frequency of rate at each step during training time, which helps prevent overfitting. Inputs not set to 0 are scaled up by 1/(1 - rate) such that the sum over all inputs is unchanged.
(trainable does not affect the layer's behavior, as Dropout does not have any variables/weights that can be frozen during training.)
rate: Float between 0 and 1. Fraction of the input units to drop.
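To make the scaling concrete, here is a minimal sketch, assuming TensorFlow 2.x and tf.keras.layers.Dropout (which documents the behavior quoted above); the input values are arbitrary:

import tensorflow as tf

layer = tf.keras.layers.Dropout(rate=0.5)
x = tf.ones((1, 10))

# training=True: roughly half the units are zeroed; the survivors become
# 1 / (1 - 0.5) = 2.0, so the expected sum over all inputs is unchanged.
print(layer(x, training=True))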
Yes, they have the same functionality: dropout as a parameter is applied before the linear transformation of that layer (the multiplication by the weights and the addition of the bias), whereas Dropout as a layer can also be used elsewhere, for instance before an activation layer (see the sketch after the quoted documentation below).
Dropout consists of randomly setting a fraction rate of input units to 0 at each update during training time, which helps prevent overfitting. The units that are kept are scaled by 1 / (1 - rate), so that their sum is unchanged at training time and inference time.
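As an illustration of that distinction, here is a hedged tf.keras sketch (the layer sizes and rates are arbitrary): dropout supplied as a constructor parameter of a layer such as LSTM, versus Dropout as a standalone layer placed before an activation.

import tensorflow as tf

# Dropout as a parameter: applied by the LSTM to the linear
# transformation of its inputs.
rnn = tf.keras.layers.LSTM(32, dropout=0.5)

# Dropout as a standalone layer: here inserted before the activation.
mlp = tf.keras.Sequential([
    tf.keras.layers.Dense(64),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Activation("relu"),
])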
Your interpretation of dropout() and its training argument is correct. However, an automatic Modes.TRAIN check as you suggest is impossible: a mode is usually tied to an Estimator's model_fn() as an optional parameter, and Estimators constitute a higher-level abstraction that is not required in a TensorFlow model.
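For completeness, here is a minimal sketch of the usual wiring, assuming the TF 1.x Estimator API (the feature key "x", the layer sizes, and the loss are illustrative only): the mode arrives as a model_fn parameter, so the dropout call is the natural place to compare it against ModeKeys.TRAIN.

import tensorflow as tf

def model_fn(features, labels, mode):
    net = tf.layers.dense(features["x"], 64, activation=tf.nn.relu)
    # Dropout fires only while the Estimator is in TRAIN mode; in EVAL
    # and PREDICT modes the input passes through untouched.
    net = tf.layers.dropout(net, rate=0.5,
                            training=(mode == tf.estimator.ModeKeys.TRAIN))
    logits = tf.layers.dense(net, 10)

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions={"logits": logits})

    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    train_op = tf.train.AdamOptimizer().minimize(
        loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)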
As to why TensorFlow designed the API with a False default value, we can only speculate. One explanation is that the layers abstraction as a whole was intended to default to inference mode, which would explain the default value of dropout()'s training argument.
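You can verify that default quickly; the following sketch assumes TF 1.x graph mode. With no training argument, tf.layers.dropout() behaves as the identity.

import tensorflow as tf

x = tf.ones((1, 4))
y_default = tf.layers.dropout(x, rate=0.5)                # training=False
y_train = tf.layers.dropout(x, rate=0.5, training=True)   # dropout applied

with tf.Session() as sess:
    print(sess.run(y_default))  # [[1. 1. 1. 1.]] -- input untouched
    print(sess.run(y_train))    # mix of 0.0 and 2.0 (scaled by 1/(1-rate))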