Overfitting after first epoch

I am using convolutional neural networks (via Keras) as my model for facial expression recognition (55 subjects). My data set is quite hard, with around 450k samples across 7 classes. I have balanced my training set per subject and per class label.

I implemented a very simple CNN architecture (with real-time data augmentation):

# Keras 1.x API, as used at the time of the question.
# borderMode, initialization and nb_output are defined elsewhere in the script.
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Flatten, Dense, Dropout, Activation
from keras.layers.advanced_activations import PReLU
from keras.layers.normalization import BatchNormalization

model = Sequential()
model.add(Convolution2D(32, 3, 3, border_mode=borderMode, init=initialization, input_shape=(48, 48, 3)))
model.add(BatchNormalization())
model.add(PReLU())
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(256))
model.add(BatchNormalization())
model.add(PReLU())
model.add(Dropout(0.5))

model.add(Dense(nb_output))
model.add(Activation('softmax'))

After the first epoch, my training loss decreases steadily while my validation loss increases. Can overfitting happen that early? Or is there a problem with my data being confusing? Should I also balance my test set?

Renz asked Oct 09 '16 14:10

People also ask

Does Epoch cause overfitting?

So, updating the weights with a single pass, or one epoch, is not enough. One epoch leads to underfitting. As the number of epochs increases, the weights in the neural network are updated more times, and the fitted curve goes from underfitting, to optimal, to overfitting.

Does number of epochs affect overfitting?

In general, too many epochs may cause your model to overfit the training data: the model does not learn the data, it memorizes it. You have to track the accuracy on validation data at each epoch (or iteration) to investigate whether it overfits.
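As a sketch of that per-epoch check, here is a minimal, framework-free helper (the loss values and the `first_overfit_epoch` name are made up for illustration) that reports the epoch after which validation loss stopped improving, i.e. where overfitting likely began:

```python
def first_overfit_epoch(val_losses, patience=1):
    """Return the 1-based epoch after which validation loss stopped improving."""
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # Loss has not improved for `patience` epochs: stop here.
            return best_epoch
    return best_epoch

# Validation loss turns up after epoch 1 -- the pattern in the question.
val_losses = [0.92, 0.97, 1.05, 1.21]
print(first_overfit_epoch(val_losses))  # 1
```

Frameworks offer the same idea built in (e.g. Keras has an `EarlyStopping` callback that monitors a validation metric with a `patience` parameter).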

When can overfitting happen?

Overfitting is a concept in data science, which occurs when a statistical model fits exactly against its training data. When this happens, the algorithm unfortunately cannot perform accurately against unseen data, defeating its purpose.

Is training 1 epoch enough?

A single epoch in training is not enough and leads to underfitting.


1 Answer

It could be that the task is easy to solve and after one epoch the model has learned enough to solve it, and training for more epochs just increases overfitting.

But if you have balanced the train set and not the test set, what may be happening is that you are training for one task (expression recognition on evenly distributed data) and then you are testing on a slightly different task, because the test set is not balanced.
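One alternative to physically re-balancing a set is to weight each class by its inverse frequency. A minimal sketch, using a hypothetical 3-class label list for brevity (Keras `fit` accepts such a dict via its `class_weight` argument):

```python
from collections import Counter

# Hypothetical unbalanced label list standing in for a test/train set.
labels = ["happy"] * 6 + ["sad"] * 3 + ["angry"] * 1

counts = Counter(labels)
n, k = len(labels), len(counts)

# Inverse-frequency weights: w_c = n / (k * count_c).
# Rare classes get a weight above 1, common classes below 1.
weights = {c: n / (k * counts[c]) for c in counts}
print(weights)
```

This keeps the train and test tasks consistent without discarding samples from the majority classes.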

Guillem Cucurull answered Oct 10 '22 23:10