
Does 'statsmodels' or another Python package offer an equivalent to R's 'step' function?

Is there a statsmodels or other Python equivalent for R's step functionality for selecting a formula-based model using AIC?

orome asked Mar 15 '14 19:03




1 Answer

I suspect you are taking the same online course as I am. The following gets you the right answers. If the task at hand is not computationally heavy (and in the course it isn't), we can sidestep all the smart details of the step function and simply try every non-empty subset of the predictors. With p predictors that means 2^p - 1 model fits.
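To get a feel for what "all the subsets" means in practice, here is a minimal sketch that only enumerates them. The predictor names are hypothetical and stand in for whatever columns your data has.

```python
import itertools

# Hypothetical predictor names, for illustration only
predictorcols = ["x1", "x2", "x3", "x4"]

# Every non-empty subset of the predictors, from size 1 up to all of them
subsets = [
    combo
    for k in range(1, len(predictorcols) + 1)
    for combo in itertools.combinations(predictorcols, k)
]

print(len(subsets))  # 2**4 - 1 = 15
```

The count doubles with every extra predictor, which is why this brute-force approach only makes sense for a small number of columns.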

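For each subset we can calculate AIC as AIC = 2*nvars - 2*result.llf, where nvars counts the fitted coefficients (intercept included) and result.llf is the log-likelihood of the fitted model.
Then we just pick the subset with the minimal AIC: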

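import itertools
import statsmodels.api as sm

# Assumes `train` (DataFrame), `predictorcols` (list of column names)
# and `target` (Series aligned with `train`) are already defined.
AICs = {}
for k in range(1, len(predictorcols) + 1):
    for variables in itertools.combinations(predictorcols, k):
        # Copy so the intercept column is not written into a slice of `train`
        predictors = train[list(variables)].copy()
        predictors['Intercept'] = 1
        res = sm.OLS(target, predictors).fit()
        # k slopes plus the intercept; this should agree with res.aic
        AICs[variables] = 2 * (k + 1) - 2 * res.llf

# Tuple of column names with the lowest AIC
min(AICs, key=AICs.get)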
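If there are too many predictors to try every subset, a greedy search is the usual fallback. The following is a minimal sketch of forward selection by AIC using the statsmodels formula API. It is not a port of R's step (which can also drop terms), and the function name `forward_select_aic` and the synthetic data are made up for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf


def forward_select_aic(data, response, candidates):
    """Greedy forward selection: add the term that lowers AIC most, stop when none does."""
    selected = []
    remaining = list(candidates)
    # Start from the intercept-only model
    best_aic = smf.ols(f"{response} ~ 1", data).fit().aic
    while remaining:
        # AIC of each model obtained by adding one more candidate term
        trials = []
        for term in remaining:
            formula = f"{response} ~ " + " + ".join(selected + [term])
            trials.append((smf.ols(formula, data).fit().aic, term))
        aic, term = min(trials)
        if aic >= best_aic:
            break  # no single addition improves AIC
        best_aic = aic
        selected.append(term)
        remaining.remove(term)
    return selected, best_aic


# Synthetic data (made up for illustration): y depends on x1 and x2 only
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(200, 4)), columns=["x1", "x2", "x3", "x4"])
df["y"] = 3.0 * df["x1"] - 2.0 * df["x2"] + rng.normal(scale=0.5, size=200)

selected, aic = forward_select_aic(df, "y", ["x1", "x2", "x3", "x4"])
print(selected, round(aic, 2))
```

Unlike the exhaustive search, this fits on the order of p^2 models at most, but it can miss the subset with the globally lowest AIC.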
Kostya answered Nov 06 '22 14:11
