As I understand it when creating a supervised learning model, our model may have high bias if we are making very simple assumptions (for example if our function is linear) which cause the algorithm to miss relationships between our features and target output resulting in errors. This is underfitting.
On the other hand if we make our algorithm too strong (many polynomial features), it'll be very sensitive to small fluctuations in our training set causing ovefitting: modeling the random noise in the training data, rather than the intended outputs. This is overfitting.
This makes sense to me, but I heard that a model can have both high variance and high bias and I just don't understand how that would possible. If high bias and high variance are synonyms for underfitting and overfitting, then how can you have both overfitting and underfitting on the same model? Is it possible? How can it happen? What does it look like when it does happen?
Overfitting, Underfitting in Regression Due to the low flexibility of a linear equation, it is not able to predict the samples (training data), therefore the error rate is high and it has a High Bias which in turn means it's underfitting. This model won't perform well on unseen data.
both overfitting and underfitting are measured in relative terms, so yes, it is possible to have both at the same time.
overfitting happens when our model captures the noise along with the underlying pattern in data. It happens when we train our model a lot over noisy datasets. These models have low bias and high variance. These models are very complex like Decision trees which are prone to overfitting.
Answer: Underfitted models have low bias. Overfitted models have low variance.
Imagine a regression problem. I define a classifier which outputs the maximum of the target variable observed in the training data, for all possible inputs.
This model is both biased (can only represent a singe output no matter how rich or varied the input) and has high variance (the max of a dataset will exhibit a lot of variability between datasets).
You're right to a certain extent that bias means a model is likely to underfit and variance means it's susceptible to overfitting, but they're not quite the same.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With