
Why is dropout preventing convergence in Convolutional Neural Network?

I am using TensorFlow to train a convnet on a set of 15,000 training images with 22 classes. I have two conv layers and one fully connected layer. I have trained the network on the 15,000 images and it converges with high accuracy on the training set.

However, my test set accuracy is much lower, so I am assuming the network is overfitting. To combat this I added dropout before the fully connected layer of my network.

However, adding dropout has caused the network to never converge, even after many iterations. I was wondering why this may be. I have even used only mild dropout (a keep probability of 0.9) and seen the same result.
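For reference, a minimal sketch of how that dropout placement usually looks in TF 1.x-era code. The layer sizes and variable names here are hypothetical (the question doesn't show its actual code); the point is the `keep_prob` placeholder, which must be fed differently at train and test time:

```python
import tensorflow as tf

# Hypothetical flattened conv features; sizes are illustrative only.
x = tf.placeholder(tf.float32, [None, 1024])
keep_prob = tf.placeholder(tf.float32)  # fed at run time

w = tf.Variable(tf.truncated_normal([1024, 22], stddev=0.1))
b = tf.Variable(tf.zeros([22]))

# Dropout applied just before the fully connected readout, as in
# the question. tf.nn.dropout uses *inverted* dropout: the kept
# activations are scaled up by 1/keep_prob during training.
x_drop = tf.nn.dropout(x, keep_prob)
logits = tf.matmul(x_drop, w) + b

# Train with keep_prob < 1.0, but always feed keep_prob=1.0 when
# evaluating; leaving dropout on at test time is a common cause of
# accuracy that never seems to converge:
# sess.run(logits, feed_dict={x: test_batch, keep_prob: 1.0})
```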

asked Jul 31 '16 by Sam K

1 Answer

With a keep probability of 0.9, each neuron connection has a 10% chance of being switched off on every iteration. So dropout strength is itself a hyperparameter with an optimum value.

(Figure from the CS231n course illustrating dropout scaling; image not reproduced here.)

As the figure shows, dropout is not just masking neurons; we are also scaling them. The figure's case uses 0.5 dropout. With 0.9 the scaling factor is different again.

So basically, with a keep probability of 0.9 the activations must be scaled to match between training and testing: either multiply the trained weights by 0.9 at test time (classic dropout), or divide the kept activations by 0.9 during training (inverted dropout, which is what TensorFlow's `tf.nn.dropout` does). Otherwise the test-time activations come out roughly 10% too large.
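To make the scaling concrete, here is a small NumPy sketch of inverted dropout, the variant `tf.nn.dropout` implements. Dividing the survivors by the keep probability keeps the expected activation the same whether dropout is on or off:

```python
import numpy as np

rng = np.random.RandomState(0)
keep_prob = 0.9
x = rng.rand(1_000_000)  # stand-in activations

# Inverted dropout: drop ~10% of units, then scale the survivors
# by 1/keep_prob so the expected activation is unchanged.
mask = rng.rand(x.size) < keep_prob
x_drop = x * mask / keep_prob

print(x.mean())       # ~0.5
print(x_drop.mean())  # also ~0.5: the scaling preserves the mean
```

Without that division (or the equivalent test-time rescaling), the downstream layer sees systematically larger inputs during testing than during training.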

Just from this you can get an idea of how dropout affects the activations. With a badly chosen probability, or a missing or incorrect scaling step, it can saturate your nodes, which causes the non-convergence issue you are seeing.

answered Nov 06 '22 by Shamane Siriwardhana