How to determine an overfitted model based on loss, precision and recall

I've written an LSTM network with Keras (code below):

    import numpy as np
    import pandas as pd
    import keras_metrics
    from sklearn.utils import shuffle
    from sklearn.model_selection import train_test_split
    from keras import optimizers
    from keras.models import Sequential
    from keras.layers import LSTM, LeakyReLU, Dropout, Flatten, Dense

    df = pd.read_csv("../data/training_data.csv")

    # Group by and pivot the data
    group_index = df.groupby('group').cumcount()
    data = (df.set_index(['group', group_index])
            .unstack(fill_value=0).stack())

    # get np arrays of the data and the labels
    # for the label we take the first value of each group because it is the same for the whole group
    target = np.array(data['label'].groupby(level=0).apply(lambda x: [x.values[0]]).tolist())
    data = data.loc[:, data.columns != 'label']
    data = np.array(data.groupby(level=0).apply(lambda x: x.values.tolist()).tolist())

    # shuffle the training set
    data, target = shuffle(data, target)

    # split the data into train and test sets
    x_train, x_test, y_train, y_test = train_test_split(data, target, test_size=0.2, random_state=4)

    # Adam optimizer with learning rate decay
    opt = optimizers.Adam(lr=0.0001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0001)

    # build the model
    model = Sequential()

    num_features = data.shape[2]
    num_samples = data.shape[1]  # timesteps per sequence

    model.add(LSTM(8, batch_input_shape=(None, num_samples, num_features), return_sequences=True, activation='sigmoid'))
    model.add(LeakyReLU(alpha=.001))
    model.add(Dropout(0.2))
    model.add(LSTM(4, return_sequences=True, activation='sigmoid'))
    model.add(LeakyReLU(alpha=.001))
    model.add(Flatten())
    model.add(Dense(1, activation='sigmoid'))

    # f1 is a custom metric function defined elsewhere in the project
    model.compile(loss='binary_crossentropy', optimizer=opt,
                  metrics=['accuracy', keras_metrics.precision(), keras_metrics.recall(), f1])

    model.summary()

    # Training, keeping the results history for plotting
    history = model.fit(x_train, y_train, epochs=3000, validation_data=(x_test, y_test))

The monitored metrics are loss, accuracy, precision, recall and f1 score.

I've noticed that the validation loss metric starts to climb around epoch 300, so I figured overfitting. However, recall is still climbing and precision is slightly improving.


[Plots of training/validation loss, precision, and recall over the training epochs]
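For reference, the plots come from the returned `history` object, roughly like this (the metric key names such as `precision`/`val_precision` depend on the Keras and keras_metrics versions):

    import matplotlib.pyplot as plt

    # history.history maps each metric name to a list with one value per epoch
    for metric in ['loss', 'precision', 'recall']:
        plt.figure()
        plt.plot(history.history[metric], label='train ' + metric)
        plt.plot(history.history['val_' + metric], label='validation ' + metric)
        plt.xlabel('epoch')
        plt.ylabel(metric)
        plt.legend()
    plt.show()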


Why is that? Is my model overfitted?

Shlomi Schwartz asked Oct 16 '18



3 Answers

the validation loss metric starts to climb around epoch 300 (...) recall is still climbing and precision is slightly improving. (...) Why is that?

Precision and recall are measures of how well your classifier performs in terms of the predicted class labels. Model loss, on the other hand, is a measure of the cross entropy, i.e. the error in the classification probability:

    loss = -(y * log(p) + (1 - y) * log(1 - p))

where

    y = true class label (0 or 1)
    p = predicted probability of the positive class

For example, the (softmax) outputs of the model for one observation may look like this at different epochs:

    # epoch 300
    y = [0.1, 0.9] => argmax(y) => 1 (class label 1)
    loss = -(1 * log(0.9)) = 0.10

    # epoch 500
    y = [0.4, 0.6] => argmax(y) => 1 (class label 1)
    loss = -(1 * log(0.6)) = 0.51

In both cases the precision and recall metrics stay unchanged (the class label is still predicted correctly), however the model loss has increased. In general terms, the model has become "less sure" about its prediction, but it is still correct.

Note that in your model the loss is calculated over all observations, not just a single one; I limit the discussion to one observation for simplicity. The loss formula extends trivially to n > 1 observations by taking the average of the per-observation losses.
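As a quick sanity check, here is that calculation in plain NumPy (not part of the question's code), reproducing the two epochs above and the averaging over a batch:

    import numpy as np

    def binary_cross_entropy(y_true, p_pred):
        # per-observation loss, then averaged over the batch
        y_true = np.asarray(y_true, dtype=float)
        p_pred = np.asarray(p_pred, dtype=float)
        losses = -(y_true * np.log(p_pred) + (1 - y_true) * np.log(1 - p_pred))
        return losses.mean()

    # same (correct) prediction, but the model is less confident at epoch 500
    print(binary_cross_entropy([1], [0.9]))   # ~0.105  (epoch 300)
    print(binary_cross_entropy([1], [0.6]))   # ~0.511  (epoch 500)

    # batch of n observations: the reported loss is the mean of the per-observation losses
    print(binary_cross_entropy([1, 0, 1], [0.9, 0.2, 0.6]))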

is my model overfitted?

In order to determine this, you have to compare training loss and validation loss. You cannot tell by validation loss alone. If training loss decreases and validation loss increases, your model is overfitting.
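A rough way to read this off the `history` object from the question (the 'loss' / 'val_loss' key names are the Keras defaults and are assumed here):

    import numpy as np

    train_loss = np.array(history.history['loss'])
    val_loss = np.array(history.history['val_loss'])

    # epoch where the validation loss bottoms out
    best_epoch = int(np.argmin(val_loss))
    print("validation loss is lowest at epoch", best_epoch)

    # training loss still falling while validation loss rises => overfitting
    overfitting = train_loss[-1] < train_loss[best_epoch] and val_loss[-1] > val_loss[best_epoch]
    print("overfitting after epoch", best_epoch, ":", overfitting)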

miraculixx answered Oct 29 '22


Indeed, if the validation loss starts growing again, then you may want to stop early. It's a "standard" approach, named "early stopping" (https://en.wikipedia.org/wiki/Early_stopping). Clearly, if the loss on your validation data is increasing, then the model is not doing as well as it could; it is overfitting.
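In Keras this is available out of the box via the EarlyStopping callback; a minimal sketch against the fit call from the question (the patience value of 50 is just an illustrative choice):

    from keras.callbacks import EarlyStopping

    # stop when validation loss has not improved for 50 epochs,
    # and roll back to the best weights seen so far
    early_stop = EarlyStopping(monitor='val_loss', patience=50, restore_best_weights=True)

    history = model.fit(x_train, y_train, epochs=3000,
                        validation_data=(x_test, y_test),
                        callbacks=[early_stop])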

Precision and recall are not enough: they can increase if your model gives more positive predictions and fewer negative ones (for instance 9 positives for 1 negative). These ratios can then seem to improve, but it's just that you have fewer true negatives.

These two observations put together can help shed some light on what is happening here. The good answers may still be good ones, but with lower quality (the loss for individual samples increases on average, while correct answers stay correct), and there could be a biased shift from bad answers to good answers (true negatives being transformed into false positives).
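A small made-up illustration of that effect with scikit-learn's metrics: recall on the positive class jumps simply because more positives are predicted, while true negatives quietly disappear:

    from sklearn.metrics import precision_score, recall_score

    y_true = [1, 1, 1, 0, 0, 0]

    cautious   = [1, 1, 0, 0, 0, 0]   # misses one positive, keeps all negatives
    aggressive = [1, 1, 1, 1, 1, 0]   # catches every positive, but flips two true negatives

    for name, y_pred in [("cautious", cautious), ("aggressive", aggressive)]:
        print(name,
              "precision:", round(precision_score(y_true, y_pred), 2),
              "recall:", round(recall_score(y_true, y_pred), 2))
    # cautious   precision: 1.0  recall: 0.67
    # aggressive precision: 0.6  recall: 1.0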

Matthieu Brucher answered Oct 29 '22


As @Matthieu mentioned, it could be biased to look at the precision and recall of one class alone. Maybe we have to look at the performance on the other class as well.

A better measure could be concordance (AUC of the ROC curve) in the case of binary classification. Concordance measures how well the model rank-orders the data points according to their likelihood of belonging to a class.
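A minimal sketch with scikit-learn, assuming the model and test split from the question are available:

    from sklearn.metrics import roc_auc_score

    # probabilities for the positive class from the sigmoid output layer
    y_prob = model.predict(x_test).ravel()
    print("ROC AUC:", roc_auc_score(y_test.ravel(), y_prob))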

One more option is macro/micro-averaged precision and recall, to get a more complete picture of the model's performance.
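Similarly, a sketch of the macro/micro-averaged scores (the 0.5 threshold is an arbitrary choice):

    from sklearn.metrics import precision_score, recall_score

    # threshold the predicted probabilities to get class labels
    y_pred = (model.predict(x_test).ravel() > 0.5).astype(int)

    for avg in ('macro', 'micro'):
        print(avg,
              "precision:", precision_score(y_test.ravel(), y_pred, average=avg),
              "recall:", recall_score(y_test.ravel(), y_pred, average=avg))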

Venkatachalam answered Oct 29 '22