 

Can a model have both high bias and high variance? Overfitting and Underfitting?

As I understand it, when creating a supervised learning model, the model may have high bias if we make overly simple assumptions (for example, that our function is linear). This causes the algorithm to miss relationships between our features and the target output, resulting in errors. This is underfitting.

On the other hand, if we make our algorithm too strong (many polynomial features), it will be very sensitive to small fluctuations in our training set and will model the random noise in the training data rather than the intended outputs. This is overfitting.

image showing underfitting and overfitting
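A minimal sketch of the two failure modes described above, assuming NumPy is available. The sine-shaped target, the noise level, the sample sizes, and the polynomial degrees are all illustrative assumptions, not taken from the question.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def make_data(n):
    """Draw n noisy samples from an assumed target, sin(2*pi*x)."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, size=n)
    return x, y

x_train, y_train = make_data(12)    # small training set
x_test, y_test = make_data(200)     # fresh data from the same process

def mse(model, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((model(x) - y) ** 2))

train_mse, test_mse = {}, {}
for degree in (1, 3, 9):
    # Least-squares polynomial fit of the given degree.
    model = Polynomial.fit(x_train, y_train, degree)
    train_mse[degree] = mse(model, x_train, y_train)
    test_mse[degree] = mse(model, x_test, y_test)
    print(f"degree {degree}: train MSE {train_mse[degree]:.3f}, "
          f"test MSE {test_mse[degree]:.3f}")
```

In this setup the degree-1 fit typically has a large error on both sets (underfitting), while the degree-9 fit drives the training error close to zero but does much worse on the fresh data (overfitting).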

This makes sense to me, but I heard that a model can have both high variance and high bias, and I just don't understand how that would be possible. If high bias and high variance are synonyms for underfitting and overfitting, then how can you have both overfitting and underfitting in the same model? Is it possible? How can it happen? What does it look like when it does happen?

Alaa Awad asked Aug 22 '15 21:08




1 Answer

Imagine a regression problem. I define a model which outputs, for every possible input, the maximum of the target variable observed in the training data.

This model is both biased (it can only represent a single output, no matter how rich or varied the input) and has high variance (the max of a dataset will exhibit a lot of variability between datasets).
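The constant-maximum predictor described above can be checked with a small simulation. This is only a sketch under stated assumptions: the linear target `true_f`, the Gaussian noise, the training-set size, and the query point `x0` are all hypothetical choices made for illustration. A predictor that outputs the training mean is included purely as a baseline for comparison.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_f(x):
    """Assumed ground-truth regression function (illustrative only)."""
    return 2.0 * x

def sample_targets(n):
    """Targets of one training set: y = true_f(x) + Gaussian noise."""
    x = rng.uniform(0.0, 1.0, size=n)
    return true_f(x) + rng.normal(0.0, 1.0, size=n)

n_datasets, n_train, x0 = 5000, 20, 0.5

max_preds = np.empty(n_datasets)   # the answer's predictor: max of training targets
mean_preds = np.empty(n_datasets)  # baseline: mean of training targets
for i in range(n_datasets):
    y = sample_targets(n_train)
    max_preds[i] = y.max()
    mean_preds[i] = y.mean()

def bias_and_variance(preds, target):
    """Bias and variance of a predictor's output across training sets."""
    return float(preds.mean() - target), float(preds.var())

max_bias, max_var = bias_and_variance(max_preds, true_f(x0))
mean_bias, mean_var = bias_and_variance(mean_preds, true_f(x0))

print(f"max predictor:  bias {max_bias:.2f}, variance {max_var:.2f}")
print(f"mean predictor: bias {mean_bias:.2f}, variance {mean_var:.2f}")
```

Both predictors ignore the input entirely, yet under these assumptions the max predictor sits far above the true value at `x0` and also varies much more from one training set to the next than the mean does.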

You're right to a certain extent that bias means a model is likely to underfit and variance means it's susceptible to overfitting, but they're not quite the same.
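One way to see why they are not quite the same is the standard decomposition of expected squared error at a point x. The notation is an assumption added here, not part of the original answer: the data follow y = f(x) + ε with E[ε] = 0 and Var(ε) = σ², f̂ is the model fitted on a random training set, and the expectation is over training sets and noise.

```latex
\mathbb{E}\!\left[\big(y - \hat{f}(x)\big)^2\right]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Bias and variance are separate terms in this sum, so nothing prevents both from being large for the same model.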

Ben Allison answered Oct 16 '22 17:10
