Higher validation accuracy than training accuracy using TensorFlow and Keras

I'm trying to use deep learning to predict income from 15 self-reported attributes from a dating site.

We're getting rather odd results: our validation data gets better accuracy and lower loss than our training data, and this is consistent across different hidden layer sizes. This is our model:

for hl1 in [250, 200, 150, 100, 75, 50, 25, 15, 10, 7]:
    def baseline_model():
        model = Sequential()
        model.add(Dense(hl1, input_dim=299, kernel_initializer='normal',
                        activation='relu',
                        kernel_regularizer=regularizers.l1_l2(0.001)))
        model.add(Dropout(0.5, seed=seed))
        model.add(Dense(3, kernel_initializer='normal', activation='sigmoid'))

        model.compile(loss='categorical_crossentropy', optimizer='adamax',
                      metrics=['accuracy'])
        return model

    history_logs = LossHistory()
    model = baseline_model()
    history = model.fit(X, Y, validation_split=0.3, shuffle=False, epochs=50,
                        batch_size=10, verbose=2, callbacks=[history_logs])

And this is an example of the accuracy and losses: [figure: accuracy with a hidden layer of 250 neurons] and [figure: the loss].

We've tried removing regularization and dropout, which, as expected, ended in overfitting (training acc: ~85%). We've even tried decreasing the learning rate drastically, with similar results.

Has anyone seen similar results?

Jasper asked May 15 '17 12:05


2 Answers

This happens when you use Dropout, since the layer behaves differently during training and testing.

When training, a percentage of the features are set to zero (50% in your case, since you are using Dropout(0.5)). When testing, all features are used (and are scaled appropriately). So the model at test time is more robust, which can lead to higher test accuracy.
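
To make the train/test difference concrete, here is a minimal pure-Python sketch of "inverted" dropout, the scheme where the surviving units are scaled up by 1/(1 - rate) during training and left untouched at inference. The `dropout` function and its arguments are illustrative names, not the Keras API, and the toy activations are made up.

```python
import random


def dropout(values, rate, training, rng):
    """Inverted dropout: zero each value with probability `rate` while training."""
    if not training:
        # Inference: every unit is kept and nothing is rescaled.
        return list(values)
    keep_scale = 1.0 / (1.0 - rate)
    # Training: drop units at random and scale the survivors so the
    # expected sum of the activations stays the same.
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


rng = random.Random(0)
activations = [1.0] * 10_000  # toy activations, all equal to 1

train_out = dropout(activations, rate=0.5, training=True, rng=rng)
test_out = dropout(activations, rate=0.5, training=False, rng=rng)

dropped_fraction = train_out.count(0.0) / len(train_out)
print(f"fraction dropped while training: {dropped_fraction:.2f}")
print(f"mean activation while training:  {sum(train_out) / len(train_out):.2f}")
print(f"mean activation at inference:    {sum(test_out) / len(test_out):.2f}")
```

The training metrics that Keras prints are computed on the noisy, half-dropped forward pass, while the validation metrics use the full network, so the two are not measured under the same conditions.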

yhenon answered Oct 01 '22 12:10


You can check the Keras FAQ and especially the section "Why is the training loss much higher than the testing loss?".

I would also suggest taking some time to read this very good article on some "sanity checks" you should always take into consideration when building a NN.

In addition, whenever possible, check whether your results make sense. For example, for an n-class classification with categorical cross-entropy, the loss at the start of training should be close to -ln(1/n) = ln(n), because an untrained network predicts roughly uniform class probabilities.
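
As a quick worked check for the 3-output model in the question: ln(3) is about 1.0986. The sketch below computes that value and confirms it equals the cross-entropy of a uniform prediction. The helper names are illustrative, not part of Keras. Note also that the loss Keras reports includes any kernel regularization penalty, so with l1_l2 enabled the printed value will sit above this baseline.

```python
import math


def expected_initial_loss(n_classes):
    """Cross-entropy of a uniform prediction over n_classes: -ln(1/n) = ln(n)."""
    return -math.log(1.0 / n_classes)


def categorical_crossentropy(y_true, y_pred):
    """Cross-entropy for one sample; y_true is one-hot, y_pred sums to 1."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)


n = 3  # number of output classes in the question's model
uniform = [1.0 / n] * n
loss = categorical_crossentropy([0, 1, 0], uniform)

print(f"expected initial loss for {n} classes: {expected_initial_loss(n):.4f}")
print(f"cross-entropy of a uniform prediction: {loss:.4f}")
```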

Beyond your specific case, I believe that, apart from Dropout, the dataset split may sometimes produce this situation. Especially if the split is not random (where temporal or spatial patterns exist), the validation set may be fundamentally different from the training set, i.e. less noisy or less variable, and thus easier to predict, leading to higher accuracy on the validation set than on the training set.
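
This is relevant to the code in the question: Keras takes the `validation_split` fraction from the last samples of the data, before any shuffling, so with `shuffle=False` the validation set is simply the final 30% of the rows. One way to rule out an ordering effect is to shuffle once yourself and split afterwards. The sketch below does that in pure Python on made-up toy data; `shuffled_split` is a hypothetical helper, not a Keras function.

```python
import random


def shuffled_split(features, labels, validation_fraction, seed):
    """Shuffle the rows once, then hold out the last fraction for validation."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    order = list(range(len(features)))
    random.Random(seed).shuffle(order)
    n_val = int(len(order) * validation_fraction)
    n_train = len(order) - n_val
    train_idx, val_idx = order[:n_train], order[n_train:]
    train = ([features[i] for i in train_idx], [labels[i] for i in train_idx])
    val = ([features[i] for i in val_idx], [labels[i] for i in val_idx])
    return train, val


# Toy data sorted by label, to mimic a dataset with an ordering pattern.
features = [[float(i)] for i in range(10)]
labels = [0] * 7 + [1] * 3

(train_x, train_y), (val_x, val_y) = shuffled_split(features, labels, 0.3, seed=7)
print("validation labels after shuffling:  ", val_y)
print("validation labels without shuffling:", labels[-3:])
```

The resulting arrays could then be passed to `model.fit` through its `validation_data` argument instead of `validation_split`. If the rows really are time-ordered, a random split may not be what you want, so treat this as a diagnostic rather than a fix.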

Moreover, if the validation set is very small compared to the training set, the model may fit the validation set better than the training set purely by chance.
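
A rough way to judge how much of a gap chance alone could explain is the binomial standard error of an accuracy estimate, sqrt(p(1 - p) / n). The sketch below assumes independent validation samples and uses a hypothetical accuracy of 0.60 purely for illustration; it is not a value from the question.

```python
import math


def accuracy_standard_error(accuracy, n_samples):
    """Binomial standard error of an accuracy measured on n_samples examples."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_samples)


true_accuracy = 0.60  # hypothetical value, chosen only for illustration
for n_samples in (100, 1_000, 10_000):
    se = accuracy_standard_error(true_accuracy, n_samples)
    # 1.96 standard errors gives an approximate 95% interval.
    print(f"n = {n_samples:>6}: accuracy = {true_accuracy:.2f} +/- {1.96 * se:.3f}")
```

If the observed train/validation gap is well outside that interval for your validation set size, a small-sample fluke is an unlikely explanation.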

Mewtwo answered Oct 01 '22 12:10