From a dataset like this:
import pandas as pd
import numpy as np
import statsmodels.api as sm
# A dataframe with two variables
np.random.seed(123)
rows = 12
rng = pd.date_range('1/1/2017', periods=rows, freq='D')
df = pd.DataFrame(np.random.randint(100,150,size=(rows, 2)), columns=['y', 'x'])
df = df.set_index(rng)
...and a linear regression model like this:
x = sm.add_constant(df['x'])
model = sm.OLS(df['y'], x).fit()
... you can easily retrieve some model coefficients this way:
print(model.params)
But I can't figure out how to retrieve the other statistics reported in the model summary:
print(str(model.summary()))
As stated in the question, I'm particularly interested in R-squared.
From the post How to extract a particular value from the OLS-summary in Pandas? I learned that you can simply use print(model.r2)
there, but that does not seem to work with statsmodels.
Any suggestions?
You can use the params property of a fitted model to get the coefficients: model.params returns the estimated intercept and slope. If you want more information, you can call model.summary(), which contains three detailed tables describing the model.
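For example (a minimal sketch using the model fitted in the question):
print(model.params)     # estimated intercept (const) and slope (x) as a pandas Series
print(model.summary())  # full text report containing the three tables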
Adjusted R-squared (model.rsquared_adj) is defined as 1 - (nobs - 1) / df_resid * (1 - rsquared) if a constant is included, and 1 - nobs / df_resid * (1 - rsquared) if no constant is included.
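As a quick sanity check, that formula can be evaluated directly from the results attributes (a minimal sketch, assuming the fitted model from the question, which does include a constant):
# Adjusted R-squared computed by hand for a model with a constant
adj_r2 = 1 - (model.nobs - 1) / model.df_resid * (1 - model.rsquared)
print(adj_r2)              # about 0.272
print(model.rsquared_adj)  # matches the Adj. R-squared in the summary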
A key difference between the two libraries is how they handle constants. Scikit-learn lets the user control the intercept through the fit_intercept parameter, while with statsmodels you add the constant column yourself, typically with sm.add_constant(), before fitting OLS.
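For comparison, here is a minimal sketch of the same fit in scikit-learn (assuming scikit-learn is available and using the df defined in the question); the intercept is requested via fit_intercept instead of adding a constant column yourself:
from sklearn.linear_model import LinearRegression

# The estimator handles the intercept itself when fit_intercept=True (the default)
lr = LinearRegression(fit_intercept=True)
lr.fit(df[['x']], df['y'])
print(lr.intercept_, lr.coef_)        # comparable to model.params above
print(lr.score(df[['x']], df['y']))   # R-squared, comparable to model.rsquared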
You can get R-squared like:
model.rsquared
import pandas as pd
import numpy as np
import statsmodels.api as sm
# A dataframe with two variables
np.random.seed(123)
rows = 12
rng = pd.date_range('1/1/2017', periods=rows, freq='D')
df = pd.DataFrame(np.random.randint(100,150,size=(rows, 2)), columns=['y', 'x'])
df = df.set_index(rng)
x = sm.add_constant(df['x'])
model = sm.OLS(df['y'], x).fit()
print(model.params)
print(model.rsquared)
print(str(model.summary()))
const 176.636417
x -0.357185
dtype: float64
0.338332793094
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.338
Model:                            OLS   Adj. R-squared:                  0.272
Method:                 Least Squares   F-statistic:                     5.113
Date:                Tue, 30 Jan 2018   Prob (F-statistic):             0.0473
Time:                        05:36:04   Log-Likelihood:                -41.442
No. Observations:                  12   AIC:                             86.88
Df Residuals:                      10   BIC:                             87.85
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        176.6364     20.546      8.597      0.000     130.858     222.415
x             -0.3572      0.158     -2.261      0.047      -0.709      -0.005
==============================================================================
Omnibus:                        1.934   Durbin-Watson:                   1.182
Prob(Omnibus):                  0.380   Jarque-Bera (JB):                1.010
Skew:                          -0.331   Prob(JB):                        0.603
Kurtosis:                       1.742   Cond. No.                     1.10e+03
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.1e+03. This might indicate that there are
strong multicollinearity or other numerical problems.
With a small bit of code:
for attr in dir(model):
    if not attr.startswith('_'):
        print(attr)
...you can see all of the attributes of the results object:
HC0_se
HC1_se
HC2_se
HC3_se
aic
bic
bse
centered_tss
compare_f_test
compare_lm_test
compare_lr_test
condition_number
conf_int
conf_int_el
cov_HC0
cov_HC1
cov_HC2
cov_HC3
cov_kwds
cov_params
cov_type
df_model
df_resid
eigenvals
el_test
ess
f_pvalue
f_test
fittedvalues
fvalue
get_influence
get_prediction
get_robustcov_results
initialize
k_constant
llf
load
model
mse_model
mse_resid
mse_total
nobs
normalized_cov_params
outlier_test
params
predict
pvalues
remove_data
resid
resid_pearson
rsquared
rsquared_adj
save
scale
ssr
summary
summary2
t_test
tvalues
uncentered_tss
use_t
wald_test
wald_test_terms
wresid
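Once you know the attribute names, individual statistics can be pulled straight off the results object. For example (a minimal sketch using the model fitted above, with an arbitrary selection of attributes):
# Collect a few summary statistics into a pandas Series
stats = {
    'rsquared': model.rsquared,
    'rsquared_adj': model.rsquared_adj,
    'fvalue': model.fvalue,
    'f_pvalue': model.f_pvalue,
    'aic': model.aic,
    'bic': model.bic,
}
print(pd.Series(stats))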