import keras

input0 = keras.layers.Input((32, 32, 3), name='Input0')
flatten = keras.layers.Flatten(name='Flatten')(input0)
relu1 = keras.layers.Dense(256, activation='relu', name='ReLU1')(flatten)
dropout = keras.layers.Dropout(1., name='Dropout')(relu1)  # dropout rate deliberately set to 1.0
softmax2 = keras.layers.Dense(10, activation='softmax', name='Softmax2')(dropout)
model = keras.models.Model(inputs=input0, outputs=softmax2, name='cifar')
Just to test whether dropout is working, I set the dropout rate to 1.0.
With every hidden unit dropped, the model state should be frozen in each epoch, with no updates to the parameters.
However, the accuracy keeps growing even though I drop all the hidden nodes.
What's wrong?
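One quick way to check whether training really is frozen is to compare the layer weights before and after a call to fit(). The snippet below is only a minimal sketch of that check; it assumes the `model` from the question plus some `x_train`/`y_train` arrays (integer class labels) are already defined.

import numpy as np

# Snapshot the weights, run one epoch, and compare.
weights_before = [w.copy() for w in model.get_weights()]
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=1, batch_size=128, verbose=1)
weights_after = model.get_weights()

# If dropout really blocked all gradients, every array should be unchanged.
for before, after, var in zip(weights_before, weights_after, model.weights):
    print(var.name, 'unchanged' if np.allclose(before, after) else 'changed')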
From the Keras documentation:

tf.keras.layers.Dropout(rate, noise_shape=None, seed=None, **kwargs)

Applies Dropout to the input. The Dropout layer randomly sets input units to 0 with a frequency of rate at each step during training time, which helps prevent overfitting. Inputs not set to 0 are scaled up by 1 / (1 - rate) such that the sum over all inputs is unchanged. Note that the Dropout layer only applies when training is set to True, so no values are dropped during inference.
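To make the scaling behaviour concrete, here is a small sketch (TF 2.x assumed) that runs a Dropout layer on a constant input with training=False and training=True:

import tensorflow as tf

x = tf.ones((1, 10))
drop = tf.keras.layers.Dropout(rate=0.5)

# Inference: the layer is an identity, every value stays 1.0.
print(drop(x, training=False).numpy())

# Training: roughly half the units become 0, the survivors are
# scaled up to 1 / (1 - 0.5) = 2.0.
print(drop(x, training=True).numpy())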
Nice catch!
It would seem that the issue linked in the comment above by Dennis Soemers, Keras Dropout layer changes results with dropout=0.0, has not been fully resolved, and Keras somehow blunders when faced with a dropout rate of 1.0 [see UPDATE at the end of the post]. Modifying the model shown in the Keras MNIST MLP example:
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import RMSprop

model = Sequential()
model.add(Dense(512, activation='relu', use_bias=False, input_shape=(784,)))
model.add(Dropout(1.0))  # rate of 1.0, i.e. drop every unit
model.add(Dense(512, activation='relu'))
model.add(Dropout(1.0))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=128,
          epochs=3,
          verbose=1,
          validation_data=(x_test, y_test))
indeed gives a model that trains, despite all neurons being dropped, as you report:
Train on 60000 samples, validate on 10000 samples
Epoch 1/3
60000/60000 [==============================] - 15s 251us/step - loss: 0.2180 - acc: 0.9324 - val_loss: 0.1072 - val_acc: 0.9654
Epoch 2/3
60000/60000 [==============================] - 15s 246us/step - loss: 0.0831 - acc: 0.9743 - val_loss: 0.0719 - val_acc: 0.9788
Epoch 3/3
60000/60000 [==============================] - 15s 245us/step - loss: 0.0526 - acc: 0.9837 - val_loss: 0.0997 - val_acc: 0.9723
Nevertheless, if you try a dropout rate of 0.99, i.e. replacing the two dropout layers in the above model with
model.add(Dropout(0.99))
then indeed you have effectively no training taking place, as should be the case:
Train on 60000 samples, validate on 10000 samples
Epoch 1/3
60000/60000 [==============================] - 16s 265us/step - loss: 3.4344 - acc: 0.1064 - val_loss: 2.3008 - val_acc: 0.1136
Epoch 2/3
60000/60000 [==============================] - 16s 261us/step - loss: 2.3342 - acc: 0.1112 - val_loss: 2.3010 - val_acc: 0.1135
Epoch 3/3
60000/60000 [==============================] - 16s 266us/step - loss: 2.3167 - acc: 0.1122 - val_loss: 2.3010 - val_acc: 0.1135
UPDATE (after a comment by Yu-Yang in the OP): It seems to be a design choice (dead link now, see the update below) not to do anything when the dropout rate is equal to either 0 or 1; the Dropout class becomes effective only if 0. < self.rate < 1.
Nevertheless, as already commented, a warning message in such cases (and a relevant note in the documentation) would arguably be a good idea.
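Until such a warning exists, you can guard against this on your own side. The helper below is only an illustrative sketch (the name `checked_dropout` is made up), warning whenever a rate of 0 or 1 would silently turn the layer into a no-op:

import warnings
from keras.layers import Dropout

def checked_dropout(rate, **kwargs):
    """Return a Dropout layer, warning when the rate makes it a no-op."""
    if not 0. < rate < 1.:
        warnings.warn(
            'Dropout(rate=%s) has no effect: the layer is only active for 0 < rate < 1.' % rate)
    return Dropout(rate, **kwargs)

# Example: this emits a warning instead of silently training a full network.
layer = checked_dropout(1.0)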
UPDATE (July 2021):

There have been some changes since Jan 2018, when this answer was written; now, under the hood, Keras calls tf.nn.dropout, which does not seem to allow for dropout=1 (source).
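A quick way to verify this on your own installation (a sketch for TF 2.x; recent versions are expected to reject the value, older ones may behave differently):

import tensorflow as tf

x = tf.ones((2, 4))
try:
    # Recent TF versions are expected to reject rate=1.0 outright.
    print(tf.nn.dropout(x, rate=1.0))
except ValueError as err:
    print('rate=1.0 rejected:', err)

# A rate just below 1 is accepted and drops (almost) every unit,
# with the few survivors scaled up by 1 / (1 - 0.99) = 100.
print(tf.nn.dropout(x, rate=0.99))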