Why is validation accuracy higher than training accuracy when applying data augmentation?

I am working on an image classification problem in Keras.

I am training the model with model.fit_generator so that I can apply data augmentation. After each epoch, I also evaluate the model on validation data.

Training uses 90% of the data and validation uses the remaining 10%. The following is my code:

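import numpy as np
from keras.callbacks import LearningRateScheduler, ModelCheckpoint
from keras.optimizers import SGD
from keras.preprocessing.image import ImageDataGenerator

# augment the training images with random rotations and zooms
datagen = ImageDataGenerator(
    rotation_range=20,
    zoom_range=0.3)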


batch_size=32
epochs=30

model_checkpoint = ModelCheckpoint('myweights.hdf5', monitor='val_acc', verbose=1, save_best_only=True, mode='max')

lr = 0.01
sgd = SGD(lr=lr, decay=1e-6, momentum=0.9, nesterov=False)
model.compile(loss='categorical_crossentropy',
          optimizer=sgd,
          metrics=['accuracy'])



def step_decay(epoch):
    # initial learning rate, drop factor, and number of epochs between drops
    # (note: a factor of 1 leaves the learning rate constant)
    initAlpha = 0.01
    factor = 1
    dropEvery = 3

    # compute learning rate for the current epoch
    alpha = initAlpha * (factor ** np.floor((1 + epoch) / dropEvery))

    # return the learning rate
    return float(alpha)



history = model.fit_generator(datagen.flow(xtrain, ytrain, batch_size=batch_size),
                              steps_per_epoch=xtrain.shape[0] // batch_size,
                              callbacks=[LearningRateScheduler(step_decay), model_checkpoint],
                              validation_data=(xvalid, yvalid),
                              epochs=epochs, verbose=1)

However, upon plotting the training accuracy and validation accuracy (as well as the training loss and validation loss), I noticed the validation accuracy is higher than training accuracy (and likewise, validation loss is lower than training loss). Here are my resultant plots after training (please note that validation is referred to as "test" in the plots):

[Image: plots of training and validation ("test") accuracy and loss per epoch]

When I do not apply data augmentation, the training accuracy is higher than the validation accuracy. From my understanding, the training accuracy should typically be greater than the validation accuracy. Can anyone explain why this is not the case in my situation, where data augmentation is applied?

670

asked Feb 17 '18 19:02

user121

1 Answer

The following is just a theory, but it is one that you can test!

One possible explanation for why your validation accuracy is better than your training accuracy is that the data augmentation you are applying to the training data is making the task significantly harder for the network. (It's not totally clear from your code sample, but it looks like you are applying the augmentation only to your training data, not your validation data.)

To see why this might be the case, imagine you are training a model to recognise whether someone in the picture is smiling or frowning. Most pictures of faces have the face the "right way up" so the model could solve the task by recognising the mouth and measuring if it curves upwards or downwards. If you now augment the data by applying random rotations, the model can no longer focus just on the mouth, as the face could be upside down. In addition to recognising the mouth and measuring its curve, the model now also has to work out the orientation of the face as a whole and compare the two.

In general, applying random transformations to your data is likely to make it harder to classify. This can be a good thing as it makes your model more robust to changes in the input, but it also means that your model gets an easier ride when you test it on non-augmented data.
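As a toy illustration of this point (not the Keras setup from the question), the sketch below scores one fixed decision rule on clean inputs and on randomly shifted copies of the same inputs. The one-dimensional data, the threshold rule standing in for a trained model, and the names `accuracy` and `augment` are all hypothetical choices made for the example.

```python
import random


def accuracy(xs, labels):
    """Score a fixed rule: predict class 1 when x > 0, else class 0."""
    hits = sum((x > 0) == bool(y) for x, y in zip(xs, labels))
    return hits / len(xs)


def augment(xs, max_shift, rng):
    """Toy 'augmentation': shift each input by a random amount."""
    return [x + rng.uniform(-max_shift, max_shift) for x in xs]


# Well-separated toy data: class 0 sits left of zero, class 1 right of it.
xs = [-1.0, -0.8, -0.6, 0.6, 0.8, 1.0]
labels = [0, 0, 0, 1, 1, 1]

rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
clean_acc = accuracy(xs, labels)
augmented_acc = accuracy(augment(xs, max_shift=1.5, rng=rng), labels)

print("clean inputs:    ", clean_acc)
print("augmented inputs:", augmented_acc)
```

The rule is already perfect on the clean inputs, so the shifted copies can only match that score or fall below it. That is the same asymmetry as measuring training accuracy on augmented images and validation accuracy on untouched ones.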

This explanation might not apply to your model and data, but you can test it in two ways:

1. If you decrease the range of the augmentation transformations you are using, you should see the training and validation loss get closer together.
2. If you apply the exact same augmentation transformations to the validation data as you do to the training data, then you should see the validation accuracy drop below the training accuracy, as you expected.
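The two tests above could be set up roughly as follows. This is a minimal sketch that assumes the `model`, `xtrain`, `ytrain`, `xvalid` and `yvalid` objects from the question; the helper names `build_flows` and `validation_steps` are hypothetical, and the halved ranges in `MILDER_AUG` are an arbitrary choice for illustration.

```python
import math

# Augmentation settings from the question, plus a hypothetical milder variant
# for the first test (halving the ranges is an arbitrary illustrative choice).
ORIGINAL_AUG = {"rotation_range": 20, "zoom_range": 0.3}
MILDER_AUG = {"rotation_range": 10, "zoom_range": 0.15}


def validation_steps(n_samples, batch_size):
    """Number of batches needed to cover every validation sample once."""
    return math.ceil(n_samples / batch_size)


def build_flows(xtrain, ytrain, xvalid, yvalid, aug, batch_size=32,
                augment_validation=False):
    """Return (train_flow, validation_data, validation_steps) for fit_generator.

    Keras is imported lazily so the settings above can be inspected without it.
    """
    from keras.preprocessing.image import ImageDataGenerator

    train_flow = ImageDataGenerator(**aug).flow(
        xtrain, ytrain, batch_size=batch_size)

    if not augment_validation:
        # Test 1: same setup as the question; only the ranges in `aug` change.
        return train_flow, (xvalid, yvalid), None

    # Test 2: validation images go through the same random transformations.
    valid_flow = ImageDataGenerator(**aug).flow(
        xvalid, yvalid, batch_size=batch_size, shuffle=False)
    return train_flow, valid_flow, validation_steps(len(xvalid), batch_size)


# Usage sketch (not executed here):
# train_flow, val_data, val_steps = build_flows(
#     xtrain, ytrain, xvalid, yvalid, ORIGINAL_AUG, augment_validation=True)
# model.fit_generator(train_flow,
#                     steps_per_epoch=xtrain.shape[0] // 32,
#                     validation_data=val_data,
#                     validation_steps=val_steps,
#                     epochs=30)
```

Because the validation transformations are random, the validation metrics in the second test will vary a little from epoch to epoch, so compare the overall trend of the curves instead of any single epoch.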

176

answered Nov 11 '22 20:11

myrtlecat

Tags: machine-learning, deep-learning, keras
