
What Is the MAE Actually Telling Me?

I've created a simple linear regression model to predict S&P 500 closing prices, then calculated the Mean Absolute Error (MAE) and got a value of 1290. I'm not asking whether this is right or wrong; I want to know what an MAE of 1290 tells me about my model.

Kabard asked Oct 29 '16 20:10



People also ask

How do you interpret MAE values?

Interpreting the MAE can be easier than interpreting the MSE. Say that you have an MAE of 10. This means that, on average, the predictions are 10 away from the true values, in the units of the target. In any case, the closer the MAE is to 0, the better.

Is a higher or lower MAE better?

Both the MAE and RMSE can range from 0 to ∞. They are negatively-oriented scores: Lower values are better.

What is a good MAE score?

The closer MAE is to 0, the more accurate the model is. But MAE is returned on the same scale as the target you are predicting for and therefore there isn't a general rule for what a good score is. How good your score is can only be evaluated within your dataset.

What does MAE mean in linear regression?

Mean absolute error (MAE) is a loss function used for regression. Use MAE when you are doing regression and don't want outliers to play a big role. The loss is the mean of the absolute differences between true and predicted values; deviations in either direction from the true value are treated the same way.


2 Answers

To be honest, "in general" it tells you nearly nothing. The value is quite arbitrary on its own, and you can draw conclusions only if you understand your data exactly.

MAE stands for Mean Absolute Error, so an MAE of 1290 means that if you randomly choose a data point from your data, you would expect your prediction to be, on average, 1290 away from the true value. Is that good? Bad? It depends on the scale of your output. If the output is in the millions, an error this big is nothing and the model is good. If your output values are in the range of thousands, this is horrible.
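To make the point about scale concrete, here is a minimal sketch in plain Python. The function names and all the numbers are made up for illustration (they are not S&P 500 data), and dividing the MAE by the mean absolute target value is just one possible way to relate the error to the output scale.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |true - predicted| over all data points."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mae_relative_to_scale(y_true, y_pred):
    """MAE divided by the mean absolute target value (an illustrative choice)."""
    scale = sum(abs(t) for t in y_true) / len(y_true)
    return mean_absolute_error(y_true, y_pred) / scale


# Hypothetical targets in the thousands: every prediction is off by 1290.
y_true_small = [1000.0, 2000.0, 3000.0]
y_pred_small = [2290.0, 710.0, 4290.0]

# Hypothetical targets in the millions: every prediction is also off by 1290.
y_true_large = [1_000_000.0, 2_000_000.0, 3_000_000.0]
y_pred_large = [1_001_290.0, 1_998_710.0, 3_001_290.0]

ratio_small = mae_relative_to_scale(y_true_small, y_pred_small)
ratio_large = mae_relative_to_scale(y_true_large, y_pred_large)

print(mean_absolute_error(y_true_small, y_pred_small))  # same MAE in both cases
print(mean_absolute_error(y_true_large, y_pred_large))
print(ratio_small, ratio_large)  # very different relative to the target scale
```

Both cases have the same MAE, but relative to the size of the targets the first error is large and the second is negligible.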

If I understand correctly, S&P 500 closing prices are numbers between 0 and 2500 (for the last 36 years), so an error of 1290 suggests your model has learned nothing. It is pretty much like a constant model that always answers "1200" or something around this value.

lejlot answered Oct 15 '22 04:10



MAE obtained with a model should always be verified against a baseline model.

A frequently used baseline is median value assignment. Calculate the MAE for the case when all your predictions are equal to the median of your target variable vector, then see for yourself whether your model's MAE is significantly below that. If it is, congrats.
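A minimal sketch of that comparison, using only the standard library. The helper names and the data are hypothetical, and taking the median from a separate training set (rather than from the evaluated sample itself) is an assumption of this sketch, not something the answer specifies.

```python
from statistics import median


def mean_absolute_error(y_true, y_pred):
    """Average of |true - predicted| over all data points."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def median_baseline_mae(y_train, y_test):
    """MAE of a constant model that always predicts the median of y_train."""
    constant = median(y_train)
    return mean_absolute_error(y_test, [constant] * len(y_test))


# Hypothetical target values and hypothetical model predictions.
y_train = [10.0, 12.0, 15.0, 20.0, 40.0]
y_test = [11.0, 18.0, 30.0]
model_pred = [12.0, 17.0, 27.0]

model_mae = mean_absolute_error(y_test, model_pred)
baseline_mae = median_baseline_mae(y_train, y_test)

print(f"model MAE:    {model_mae:.2f}")
print(f"baseline MAE: {baseline_mae:.2f}")
print("model beats baseline" if model_mae < baseline_mae else "model does not beat baseline")
```

The median is a natural constant baseline here because, among all constant predictions, the median of a sample minimizes the mean absolute error on that sample.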

Note that in this case the baseline MAE will depend on the target distribution. If your test sample contains lots of instances that are really close to the median, it will be almost impossible to get a model with an MAE better than the baseline. Thus, MAE should only be used when your test sample is sufficiently diverse. In the extreme case of only 1 instance in the test sample, the median is that instance itself, so you will get a baseline MAE of 0, which will always be no worse than any model you may come up with.

This issue with MAE is especially notable when you get an MAE for your total sample and then want to check how it changes across different subsamples. Say you have a model that predicts yearly income based on education, age, marital status, etc. You get an MAE of $1.2k, the baseline MAE is $5k, so you conclude that your model is pretty good. Then you check how the model deals with bottom-earners and get an MAE of $1.7k against a baseline of $0.5k: within that subsample the model does worse than simply predicting the subsample median. The same is likely to occur if you inspect the errors in the 18-22yo demographic.
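The subsample effect can be reproduced with a small sketch. Everything here is hypothetical: the helper names, the group labels, and the toy income figures (in $k), which were chosen so the effect shows up. The baseline for each group is the median of that group's own targets.

```python
from collections import defaultdict
from statistics import median


def mean_absolute_error(y_true, y_pred):
    """Average of |true - predicted| over all data points."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mae_vs_median_baseline(y_true, y_pred):
    """Return (model MAE, MAE of always predicting the median of y_true)."""
    constant = median(y_true)
    baseline = mean_absolute_error(y_true, [constant] * len(y_true))
    return mean_absolute_error(y_true, y_pred), baseline


def mae_by_group(y_true, y_pred, groups):
    """Compute (model MAE, baseline MAE) separately for each group label."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: mae_vs_median_baseline(ts, ps) for g, (ts, ps) in buckets.items()}


# Toy yearly incomes in $k with made-up predictions.
y_true = [10.0, 11.0, 12.0, 50.0, 80.0, 110.0]
y_pred = [13.0, 14.0, 9.0, 52.0, 78.0, 113.0]
groups = ["low", "low", "low", "high", "high", "high"]

overall = mae_vs_median_baseline(y_true, y_pred)
per_group = mae_by_group(y_true, y_pred, groups)

print("overall (model, baseline):", overall)
for name, (model_mae, baseline_mae) in per_group.items():
    print(name, "(model, baseline):", (model_mae, baseline_mae))
```

On the full sample the model is far below the baseline, but inside the narrow "low" group the targets sit so close to their own median that the constant baseline wins.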

lotrus28 answered Oct 15 '22 04:10
