I am using TensorFlow to train a convnet on a set of 15000 training images spread across 22 classes. The network has 2 conv layers and one fully connected layer. After training on the 15000 images it converges and reaches high accuracy on the training set.
However, accuracy on my test set is much lower, so I am assuming the network is overfitting. To combat this I added dropout before the fully connected layer of my network.
Adding dropout, however, causes the network to never converge, even after many iterations, and I am wondering why that is. I have even used a high keep probability of 0.9 (i.e., only 10% of units are dropped) and seen the same result.
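Roughly, the dropout is wired in like this (a simplified TF 1.x-style sketch; the names and sizes below are only illustrative, not my exact code):

```python
import tensorflow as tf

# Illustrative shapes only: flattened conv output feeding a 22-class FC layer.
n_features = 7 * 7 * 64
n_classes = 22

flat = tf.placeholder(tf.float32, [None, n_features])   # flattened output of the conv layers
keep_prob = tf.placeholder(tf.float32)                   # fed 0.9 while training, 1.0 when evaluating

# Dropout inserted just before the fully connected layer
dropped = tf.nn.dropout(flat, keep_prob=keep_prob)

w_fc = tf.Variable(tf.truncated_normal([n_features, n_classes], stddev=0.1))
b_fc = tf.Variable(tf.zeros([n_classes]))
logits = tf.matmul(dropped, w_fc) + b_fc
```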
Well, by making your keep probability 0.9, each neuron connection has a 10% chance of being switched off on every iteration. So, as with other hyperparameters, there is an optimum value for dropout that you have to find.
Dropout also scales the activations. The classic illustration uses a keep probability of 0.5; with 0.9 the scaling is different. In the original formulation, activations are multiplied by the keep probability (0.9 here) at test time, so the network does not suddenly see activations roughly 10% larger than it did during training. TensorFlow's tf.nn.dropout does the equivalent thing the other way round ("inverted" dropout): the kept units are scaled up by 1/keep_prob during training, and nothing needs to be rescaled at test time.
That should give you an idea of how dropout affects the network: with some keep probabilities the combination of randomly dropping units and rescaling the rest can saturate your nodes, and that can be what causes the non-convergence you are seeing.
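To make the scaling concrete, here is a tiny sketch (assuming the TF 1.x `tf.nn.dropout(x, keep_prob)` API) showing that the kept units come out multiplied by 1/keep_prob:

```python
import tensorflow as tf

# Inverted dropout as applied by tf.nn.dropout: each unit survives with
# probability keep_prob, and surviving units are scaled by 1/keep_prob so the
# expected activation stays the same as without dropout.
x = tf.ones([1, 10])
y = tf.nn.dropout(x, keep_prob=0.9)

with tf.Session() as sess:
    print(sess.run(y))
    # Surviving entries print as 1/0.9 ≈ 1.111..., dropped entries as 0.0.
    # Feeding keep_prob=1.0 (at test time) returns the input unchanged.
```

So during training the activations are both noisier and slightly larger than without dropout, which is why a keep probability that works for one setup may not work for another.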