
How to compute AIC for linear regression model in Python?

I want to compute AIC for linear models to compare their complexity. I did it as follows:

import numpy as np
from sklearn import linear_model

def aic(y, y_pred, k):
    resid = y - y_pred.ravel()
    sse = sum(resid ** 2)

    AIC = 2*k - 2*np.log(sse)

    return AIC

regr = linear_model.LinearRegression()
regr.fit(X, y)

aic_intercept_slope = aic(y, regr.coef_[0] * X.as_matrix() + regr.intercept_, k=1)

But I get a "divide by zero encountered in log" error.

YNR asked Jul 11 '17 11:07


People also ask

How do you calculate AIC in linear regression?

AIC = -2(log-likelihood) + 2K, where K is the number of model parameters (the number of variables in the model plus the intercept) and the log-likelihood is a measure of model fit.

What is AIC in linear regression?

The Akaike information criterion (AIC) is a metric used to compare the fit of different regression models. It is calculated as AIC = 2K – 2ln(L), where K is the number of model parameters and L is the maximized value of the model's likelihood.
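For a linear model fitted by least squares, the general formula AIC = 2K – 2ln(L) can be written in terms of the sum of squared errors. The sketch below assumes independent Gaussian errors with the variance estimated by maximum likelihood as SSE/n, where n is the number of observations; SSE and n are symbols introduced here for illustration. Some conventions also count the error variance as a parameter, which shifts every model's AIC by the same constant.

```latex
\ln \hat{L} = -\frac{n}{2}\left[\ln(2\pi) + \ln\!\left(\frac{\mathrm{SSE}}{n}\right) + 1\right]
\qquad\Longrightarrow\qquad
\mathrm{AIC} = 2K - 2\ln \hat{L}
             = n\left[\ln(2\pi) + \ln\!\left(\frac{\mathrm{SSE}}{n}\right) + 1\right] + 2K
```

When comparing models fitted to the same data, the terms that depend only on n cancel, so n ln(SSE/n) + 2K ranks the models identically.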

Can AIC be used for regression?

In regression, AIC is asymptotically optimal for selecting the model with the least mean squared error, under the assumption that the "true model" is not in the candidate set.

What is a good AIC value in linear regression?

The AIC function is 2K – 2(log-likelihood). Lower AIC values indicate a better-fit model. As a rule of thumb, a model whose AIC is more than 2 units lower than that of the model it is being compared to (a delta-AIC greater than 2) is considered meaningfully better.
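As a minimal sketch of how a delta-AIC comparison might look, the snippet below subtracts the smallest AIC in a set from each candidate's AIC. The function name delta_aic, the model names, and the AIC values are all hypothetical and chosen only for illustration.

```python
def delta_aic(aic_by_model):
    """Return each model's AIC minus the smallest AIC in the set."""
    best = min(aic_by_model.values())
    return {name: value - best for name, value in aic_by_model.items()}


# Made-up AIC values for three candidate models (illustration only).
candidates = {
    "intercept_only": 112.5,
    "one_predictor": 104.0,
    "two_predictors": 103.5,
}

deltas = delta_aic(candidates)
for name, delta in sorted(deltas.items(), key=lambda item: item[1]):
    print(f"{name}: delta-AIC = {delta:.1f}")
```

The model with a delta of 0 has the lowest AIC; under the rule of thumb above, candidates within about 2 units of it would usually be treated as having comparable support.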

1 Answer

sklearn's LinearRegression is good for prediction but pretty barebones as you've discovered. (It's often said that sklearn stays away from all things statistical inference.)

Fitting statsmodels.regression.linear_model.OLS returns a results object that has an aic attribute, along with a number of other pre-canned statistics.

However, note that you'll need to manually add a column of ones to your X matrix to include an intercept in your model.

from statsmodels.regression.linear_model import OLS
from statsmodels.tools import add_constant

# add_constant prepends a column of ones so the model includes an intercept
regr = OLS(y, add_constant(X)).fit()
print(regr.aic)

Source is here if you are looking for an alternative way to write manually while still using sklearn.
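For the manual route, here is a minimal sketch that computes AIC from observed and predicted values, assuming independent Gaussian errors with the variance estimated as SSE/n. The function name gaussian_aic and the toy numbers are hypothetical; k counts the fitted coefficients, intercept included, which to my understanding is the convention statsmodels follows. It also guards against a zero SSE, which is what triggers the divide-by-zero in the log.

```python
import math


def gaussian_aic(y, y_pred, k):
    """AIC for a least-squares fit, assuming i.i.d. Gaussian errors.

    k is the number of fitted coefficients, intercept included.
    """
    n = len(y)
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))
    if sse <= 0:
        # A perfect fit makes log(sse / n) undefined.
        raise ValueError("SSE is zero: the log-likelihood is unbounded")
    # Maximized Gaussian log-likelihood with variance estimated as sse / n.
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(sse / n) + 1)
    return 2 * k - 2 * log_lik


# Toy data for illustration; with sklearn, y_pred would come from regr.predict(X).
y = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
aic_value = gaussian_aic(y, y_pred, k=2)
print(aic_value)
```

Passing regr.predict(X) instead of rebuilding the prediction from coef_ and intercept_ avoids shape mistakes, and k=2 here stands for one slope plus the intercept.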

Brad Solomon answered Oct 22 '22 14:10
