I can't understand the following code in the Deep MNIST for Experts tutorial.
train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
What is the purpose of `keep_prob: 0.5` when running `train_step`?
The `keep_prob` value controls the dropout rate used when training the neural network. Essentially, it means that each connection between layers (in this case between the last densely connected layer and the readout layer) is only used with probability 0.5 during training. This reduces overfitting. For more information on the theory of dropout, see the original paper by Srivastava et al.; to see how to use it in TensorFlow, see the documentation for the `tf.nn.dropout()` operator.
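To make the effect concrete, here is a minimal sketch (not from the tutorial) assuming the TensorFlow 1.x API the tutorial uses, run through `tf.compat.v1` so it also works under TF 2.x:

```python
# Minimal sketch of tf.nn.dropout, assuming the TF 1.x API via tf.compat.v1.
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

x = tf.placeholder(tf.float32, shape=[None, 4])
keep_prob = tf.placeholder(tf.float32)

# Each element of x is kept with probability keep_prob (and scaled by
# 1 / keep_prob so the expected sum of activations is unchanged);
# the remaining elements are set to zero.
dropped = tf.nn.dropout(x, keep_prob)

with tf.Session() as sess:
    values = np.ones((1, 4), dtype=np.float32)
    print(sess.run(dropped, feed_dict={x: values, keep_prob: 0.5}))
    # e.g. [[2. 0. 2. 0.]] -- kept entries are scaled by 1 / 0.5 = 2
    print(sess.run(dropped, feed_dict={x: values, keep_prob: 1.0}))
    # [[1. 1. 1. 1.]] -- with keep_prob = 1.0, dropout is a no-op
```

The scaling by `1 / keep_prob` is why feeding `keep_prob = 1.0` at evaluation time needs no other change to the graph.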
The `keep_prob` value is fed in via a placeholder so that the same graph can be used for training (with `keep_prob = 0.5`) and evaluation (with `keep_prob = 1.0`). An alternative way to handle these cases is to build different graphs for training and evaluation: look at the use of dropout in the current `convolutional.py` model for an example.
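Here is a self-contained sketch of that single-graph pattern. The model is a toy one-layer classifier, not the tutorial's network, and the sizes and variable names are illustrative; only the `keep_prob` feeding pattern is the point:

```python
# Sketch of using one graph for both training (keep_prob = 0.5) and
# evaluation (keep_prob = 1.0). TF 1.x API via tf.compat.v1; the model
# itself is a toy stand-in for the tutorial's network.
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

x = tf.placeholder(tf.float32, [None, 10])
y_ = tf.placeholder(tf.float32, [None, 2])
keep_prob = tf.placeholder(tf.float32)  # dropout rate fed at run time

w = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
b = tf.Variable(tf.zeros([2]))
logits = tf.matmul(tf.nn.dropout(x, keep_prob), w) + b

loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_, logits=logits))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
accuracy = tf.reduce_mean(tf.cast(
    tf.equal(tf.argmax(logits, 1), tf.argmax(y_, 1)), tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    xs = np.random.rand(32, 10).astype(np.float32)
    ys = np.eye(2, dtype=np.float32)[np.random.randint(0, 2, 32)]
    # Training: dropout is active, half of the inputs are zeroed.
    sess.run(train_step, feed_dict={x: xs, y_: ys, keep_prob: 0.5})
    # Evaluation: keep_prob = 1.0 disables dropout on the same graph.
    print(sess.run(accuracy, feed_dict={x: xs, y_: ys, keep_prob: 1.0}))
```

This mirrors the tutorial's `train_step.run(feed_dict={..., keep_prob: 0.5})` and `accuracy.eval(feed_dict={..., keep_prob: 1.0})` calls: the graph is built once, and the placeholder decides at each `run` whether dropout is in effect.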