What is the right way to measure if a machine learning model has overfit?

Question

I understand the intuitive meaning of overfitting and underfitting. Now, given a particular machine learning model that has been trained on the training data, how can you tell whether it has overfit or underfit the data? Is there a quantitative way to measure this?

Can we look at the error and say if it has overfit or underfit?

Erik · Accepted Answer

I believe the easiest approach is to have two sets of data: training data and validation data. You keep training the model on the training data for as long as its fitness on the training data stays close to its fitness on the validation data. When the model's fitness keeps increasing on the training data but not on the validation data, you are overfitting.
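To make the comparison concrete, here is a minimal sketch of that check. It assumes you have already recorded one loss value per epoch on each set (lower is better; with a fitness score you would flip the comparison). The function names, the `patience` parameter, and the loss curves are all made up for illustration.

```python
def best_validation_epoch(val_losses, patience=3):
    """Return (best_epoch, stopped_epoch) under a simple early-stopping rule.

    Training is considered to have started overfitting once the validation
    loss has failed to improve for `patience` consecutive epochs.
    """
    best_epoch = 0
    best_loss = float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    # Validation loss never stalled for `patience` epochs.
    return best_epoch, len(val_losses) - 1


def generalization_gap(train_losses, val_losses):
    """Per-epoch difference between validation loss and training loss."""
    return [v - t for t, v in zip(train_losses, val_losses)]


# Hypothetical loss curves: training loss keeps falling,
# validation loss bottoms out at epoch 3 and then rises.
train_losses = [1.00, 0.70, 0.50, 0.35, 0.25, 0.18, 0.12, 0.08]
val_losses = [1.05, 0.78, 0.62, 0.55, 0.56, 0.60, 0.66, 0.73]

best_epoch, stopped_epoch = best_validation_epoch(val_losses, patience=3)
gaps = generalization_gap(train_losses, val_losses)

print("best epoch:", best_epoch)
print("stopped at epoch:", stopped_epoch)
print("gap at start vs. end:", gaps[0], gaps[-1])
```

A widening gap alongside a rising validation loss is the overfitting signature described above. If both losses are still high and close together, that points to underfitting instead.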

Qnan · Answer

The usual way, I think, is known as cross-validation. The idea is to split the training set into several pieces, known as folds, then pick one at a time for evaluation and train on the remaining ones.

It does not, of course, measure the actual overfitting or underfitting, but if you can vary the complexity of the model, e.g. by changing the regularization term, you can find the optimal point. This is as far as one can go with just training and testing, I think.
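As a rough sketch of that procedure, the snippet below runs 5-fold cross-validation over several regularization strengths and picks the one with the lowest held-out error. Everything specific here is an assumption chosen for illustration: the synthetic sine data, the degree-9 polynomial features, closed-form ridge regression as the model, and the grid of `alpha` values.

```python
import numpy as np


def kfold_indices(n_samples, n_folds, rng):
    """Shuffle sample indices and split them into n_folds roughly equal folds."""
    return np.array_split(rng.permutation(n_samples), n_folds)


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression: solve (X^T X + alpha * I) w = X^T y.

    For simplicity the constant feature is penalized as well.
    """
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


def cv_mse(X, y, alpha, folds):
    """Mean squared error on the held-out fold, averaged over all folds."""
    errors = []
    for i, val_idx in enumerate(folds):
        # Train on every fold except the i-th, evaluate on the i-th.
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        w = fit_ridge(X[train_idx], y[train_idx], alpha)
        residuals = X[val_idx] @ w - y[val_idx]
        errors.append(np.mean(residuals ** 2))
    return float(np.mean(errors))


# Synthetic data (an assumption for this sketch): noisy samples of sin(3x).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=40)
y = np.sin(3.0 * x) + rng.normal(scale=0.2, size=x.size)

# Degree-9 polynomial features; flexible enough to overfit when alpha is tiny.
X = np.vander(x, N=10, increasing=True)

folds = kfold_indices(len(y), n_folds=5, rng=rng)
alphas = [1e-6, 1e-4, 1e-2, 1.0, 100.0]
cv_errors = {alpha: cv_mse(X, y, alpha, folds) for alpha in alphas}
best_alpha = min(cv_errors, key=cv_errors.get)

for alpha in alphas:
    print(f"alpha={alpha:g}  cv_mse={cv_errors[alpha]:.4f}")
print("selected alpha:", best_alpha)
```

Very small `alpha` values leave the model free to overfit and very large ones force it to underfit, so the cross-validated error typically traces a U-shaped curve whose minimum is the "optimal point" mentioned above.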

Tags:

machine-learning

data-mining

London guy
