How to compute AIC for linear regression model in Python?

Tags: python, linear-regression
I want to compute AIC for linear models to compare their complexity. I did it as follows:

import numpy as np
from sklearn import linear_model

def aic(y, y_pred, k):
   resid = y - y_pred.ravel()
   sse = sum(resid ** 2)

   AIC = 2*k - 2*np.log(sse)

   return AIC

regr = linear_model.LinearRegression()
regr.fit(X, y)

# .values replaces DataFrame.as_matrix(), which was removed in pandas 1.0
aic_intercept_slope = aic(y, regr.coef_[0] * X.values + regr.intercept_, k=1)

But NumPy reports "divide by zero encountered in log" when I run this.


asked Jul 11 '17 11:07

YNR

1 Answer

sklearn's LinearRegression is good for prediction but pretty barebones as you've discovered. (It's often said that sklearn stays away from all things statistical inference.)

The results object returned by fitting statsmodels.regression.linear_model.OLS has an aic attribute, along with a number of other pre-canned attributes.

However, note that you'll need to manually add a column of ones to your X matrix to include an intercept in your model.

from statsmodels.regression.linear_model import OLS
from statsmodels.tools import add_constant

regr = OLS(y, add_constant(X)).fit()
print(regr.aic)

Source is here if you would rather write the calculation by hand while still using sklearn.
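For the by-hand route, here is a minimal sketch that assumes Gaussian errors and uses the maximised log-likelihood of the least-squares fit. The function name gaussian_aic, the n_params argument, and the toy data are my own and not from the question or from any library. As far as I know, statsmodels counts only the regression coefficients (intercept included) in the penalty term, so I do the same here. The sketch also guards against a zero sum of squared errors, which is the most likely cause of the "divide by zero encountered in log" message in the question: a perfect fit gives sse = 0, and np.log(0) is undefined.

```python
import numpy as np


def gaussian_aic(y, y_pred, n_params):
    """AIC of a least-squares fit, assuming Gaussian errors.

    n_params (a name chosen for this sketch) counts the fitted
    coefficients, including the intercept.
    """
    resid = np.asarray(y, dtype=float).ravel() - np.asarray(y_pred, dtype=float).ravel()
    n = resid.size
    sse = float(np.sum(resid ** 2))
    if sse == 0.0:
        # A perfect fit makes log(sse) undefined; NumPy would otherwise
        # emit "divide by zero encountered in log" and return -inf.
        raise ValueError("SSE is zero; the Gaussian log-likelihood is unbounded")
    # Maximised Gaussian log-likelihood with sigma^2 estimated as sse / n.
    log_lik = -0.5 * n * (np.log(2 * np.pi) + np.log(sse / n) + 1)
    return 2 * n_params - 2 * log_lik


# Toy data, made up for illustration: y is roughly 2*x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
slope, intercept = np.polyfit(x, y, deg=1)
aic_value = gaussian_aic(y, slope * x + intercept, n_params=2)
print(aic_value)
```

With a fitted sklearn model named regr, the equivalent call would be gaussian_aic(y, regr.predict(X), n_params=X.shape[1] + 1), where the + 1 accounts for the intercept. Only differences in AIC between models fitted to the same data are meaningful, so dropping the constant terms changes the absolute values but not the comparison.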


answered Oct 22 '22 14:10

Brad Solomon
