Keras cifar10 example validation and test loss lower than training loss

I'm playing around with the cifar10 example from Keras, which you can find here. I've recreated the model (i.e., not the same file, but everything else is pretty much the same), and you can find it here.

The model is identical, and I train it for 30 epochs with a 0.2 validation split on the 50,000-image training set. I can't make sense of the result I get. My validation and test loss are lower than the training loss (and, conversely, the training accuracy is lower than the validation and test accuracy):

                 Loss    Accuracy
    Training     1.345   0.572
    Validation   1.184   0.596
    Test         1.19    0.596
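
For reference, the training and evaluation calls look roughly like the sketch below (assuming the same preprocessing and the compiled `model` from my script linked above; the batch size shown is just illustrative):

```python
from keras.datasets import cifar10
from keras.utils import to_categorical

# Load and preprocess CIFAR-10 (50,000 training / 10,000 test images).
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# `model` is assumed to be the compiled CNN from the script linked above.
history = model.fit(x_train, y_train,
                    batch_size=32,          # illustrative batch size
                    epochs=30,
                    validation_split=0.2,   # last 20% of x_train held out
                    shuffle=True)

test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print('Test loss: %.3f, test accuracy: %.3f' % (test_loss, test_acc))
```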

[Plot: model training and validation loss and accuracy]

Looking at the plot, I'm not sure why the training error starts increasing again so badly. Do I need to reduce the number of epochs I train for, or maybe implement early stopping? Would a different model architecture help? If so, what would you suggest?
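
For what it's worth, if I were to add early stopping I assume it would look something like the sketch below, using Keras' built-in callback (the `patience` value is just a guess on my part):

```python
from keras.callbacks import EarlyStopping

# Stop training once the validation loss has not improved for `patience` epochs.
early_stop = EarlyStopping(monitor='val_loss', patience=5)

model.fit(x_train, y_train,
          batch_size=32,
          epochs=30,
          validation_split=0.2,
          callbacks=[early_stop])
```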

Thanks.

shaun
2 Answers

Marcin's answer is good. There are also a few other big reasons for a higher training error:

  • Dropout. Dropout layers are "on" during training but are turned "off" (skipped) during validation and testing. This is automatic and by design. Dropout hurts the training error slightly, which is to be expected; dropout layers are still very helpful for regularizing deep neural nets despite the extra training difficulty, and most deep neural nets probably use them (see the sketch after this list).

  • Learning rate too high. It's like tossing a coin into a glass: thrown too hard, it can jump back out.
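
To make these two points concrete, here is a rough sketch (assuming the RMSprop optimizer used by the Keras cifar10 example; the learning-rate value is purely illustrative, not a tuned recommendation):

```python
from keras.optimizers import RMSprop

# Recompile with a smaller learning rate so each update is gentler and the
# training loss is less likely to "jump out of the glass".
opt = RMSprop(lr=1e-4, decay=1e-6)   # illustrative values
model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])

# Nothing special is needed for dropout: model.evaluate() and model.predict()
# automatically run with the dropout layers disabled.
val_loss, val_acc = model.evaluate(x_test, y_test, verbose=0)
```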

Geoffrey Anderson


This is a rare phenomenon but it happens from time to time. There are several reasons why this might be the case:

  • Smaller datasets have smaller intrinsic variance. Your model may be capturing the patterns in your data properly, and the training error is higher simply because the internal variance of the training set is greater than that of the validation set.
  • Simple accident: it can also just happen that your particular train/validation split favours this behaviour (one way to check is sketched below).
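
A quick way to check the second point (my sketch, with `model`, `x_train` and `y_train` as in the question): shuffle the training arrays yourself before calling `fit`, so a different 20% ends up in the validation split on each run. If the gap persists across several shuffles, it is probably not an accident of the split.

```python
import numpy as np

# Keras' validation_split always takes the *last* fraction of the arrays,
# so shuffling first gives a different validation set on every run.
idx = np.random.permutation(len(x_train))
x_shuffled, y_shuffled = x_train[idx], y_train[idx]

model.fit(x_shuffled, y_shuffled,
          batch_size=32,
          epochs=30,
          validation_split=0.2)
```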
Marcin Możejko