 

Overfitting after one epoch

I am training a model using Keras.

from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation

model = Sequential()
model.add(LSTM(units=300, input_shape=(timestep, 103), use_bias=True, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(units=536))
model.add(Activation("sigmoid"))
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

while True:
    history = model.fit_generator(
        generator=data_generator(x_[train_indices], y_[train_indices],
                                 batch=batch, timestep=timestep),
        steps_per_epoch=int(train_indices.shape[0] / batch),
        epochs=1,
        verbose=1,
        validation_data=data_generator(x_[validation_indices], y_[validation_indices],
                                       batch=batch, timestep=timestep),
        validation_steps=int(validation_indices.shape[0] / batch))

It is multioutput classification according to the scikit-learn.org definition: "Multioutput regression assigns each sample a set of target values. This can be thought of as predicting several properties for each data-point, such as wind direction and magnitude at a certain location."
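For concreteness, a minimal sketch of what multioutput (multi-label) targets look like for this model; the label indices and threshold are made up for illustration:

import numpy as np

# Hypothetical multi-label target: one entry per label (536 here, matching the
# Dense layer); several labels can be active at once, unlike multi-class.
y_sample = np.zeros(536, dtype=np.float32)
y_sample[[3, 17, 200]] = 1.0                 # labels 3, 17 and 200 are all "on"

# A sigmoid output gives an independent probability per label, so predictions
# are thresholded per label rather than taken with argmax:
probs = np.random.rand(536)                  # stand-in for model.predict(...)
predicted = (probs > 0.5).astype(np.float32)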

Since it is a recurrent neural network, I tried out different timestep sizes, but the result/problem is mostly the same.

After one epoch, my train loss is around 0.0X and my validation loss is around 0.6X. These values stay stable for the next 10 epochs.

The dataset is around 680,000 rows; 9/10 is used as training data and 1/10 as validation data.
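A minimal sketch of how such a split could be produced; the names x_, train_indices and validation_indices come from the code above, while the shuffling is an assumption:

import numpy as np

n_samples = x_.shape[0]                      # ~680,000 rows
indices = np.random.permutation(n_samples)   # shuffle before splitting (assumption)

split = int(n_samples * 0.9)
train_indices = indices[:split]              # 9/10 for training
validation_indices = indices[split:]         # 1/10 for validation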

I am asking for the intuition behind this:

  • Is my model already overfitted after just one epoch?
  • Is 0.6xx even a good value for a validation loss?

High-level question: Since it is a multioutput classification task (not multi-class), I see sigmoid with binary_crossentropy as the only way. Do you suggest another approach?

hallo02 asked May 22 '17

People also ask

Do epochs cause overfitting?

One epoch typically leads to underfitting. As the number of epochs increases, the weights in the neural network are updated more times, and the fit goes from underfitting to optimal to overfitting.

Does number of epochs affect overfitting?

In general, too many epochs may cause your model to overfit the training data: instead of learning the underlying patterns, it memorizes the data.

Why do we need more than 1 epoch?

Why do we use multiple epochs? Researchers want to get good performance on non-training data (in practice this can be approximated with a hold-out set); usually (but not always) that takes more than one pass over the training data.
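In the manual while True loop above this has to be checked by hand; with a fixed epochs budget, Keras' EarlyStopping callback can stop training before overfitting sets in. A minimal sketch, where train_gen, val_gen, steps and val_steps are hypothetical stand-ins for the generators and step counts defined earlier:

from keras.callbacks import EarlyStopping

# Stop training once validation loss has not improved for 3 epochs in a row.
early_stop = EarlyStopping(monitor="val_loss", patience=3)

history = model.fit_generator(
    generator=train_gen,            # hypothetical generators and step counts,
    steps_per_epoch=steps,          # standing in for the ones defined above
    epochs=50,                      # upper bound; early stopping cuts it short
    validation_data=val_gen,
    validation_steps=val_steps,
    callbacks=[early_stop])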


1 Answer

I've experienced this issue and found that the learning rate and batch size have a huge impact on the learning process. In my case, I did two things (see the sketch after this list).

  • Reduce the learning rate (try 0.00005)
  • Reduce the batch size (8, 16, 32)
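A minimal sketch of both points applied to the question's model; note that Adam's learning-rate argument is named lr in Keras of this era:

from keras.optimizers import Adam

# Recompile with an explicit, smaller learning rate instead of the string
# default "adam" (which uses lr=0.001).
model.compile(loss="binary_crossentropy",
              optimizer=Adam(lr=0.00005),
              metrics=["accuracy"])

batch = 16   # also try smaller batch sizes, e.g. 8, 16, 32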

Moreover, you can try the basic steps for preventing overfitting.

  • Reduce the complexity of your model
  • Increase the training data and balance the samples per class
  • Add more regularization (Dropout, BatchNorm); see the sketch after this list
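A minimal sketch of the last point applied to the question's model; the BatchNormalization placement, the extra Dropout rate, and the L2 strength are assumptions, not something from the original post:

from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation, Dropout, BatchNormalization
from keras.regularizers import l2

model = Sequential()
model.add(LSTM(units=300, input_shape=(timestep, 103),
               dropout=0.2, recurrent_dropout=0.2,
               kernel_regularizer=l2(1e-4)))   # L2 weight decay (assumed strength)
model.add(BatchNormalization())                # normalize the LSTM features
model.add(Dropout(0.5))                        # extra dropout before the classifier
model.add(Dense(units=536))
model.add(Activation("sigmoid"))
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])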
Ekkalak Thongthanomkul answered Oct 03 '22