I am trying to run ANOVA using statsmodels. For each column (categorical feature) in my dataframe, I was building a model with respect to one column, 'imp', as follows in a loop:
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

for cat_feature in df.columns.drop('imp'):
    data_model = pd.DataFrame({
        'x': df[cat_feature],
        'y': df['imp']})
    model = smf.ols('y ~ x', data=data_model).fit()
    res = sm.stats.anova_lm(model, typ=1)
But what I would like to do is this:
smf.ols(df['imp'] ~ df[cat_feature], data=df).fit()
This isn't valid syntax, though. I'd like to avoid building data_model on every iteration, since one of its columns is always the same. Is that possible?
Or, simply put:
model = smf.ols('A ~ B', data=df).fit()
works, but
model2 = smf.ols(df.A ~ df.B, data=df).fit()
doesn't.
The formula interface, lowercase ols in contrast to uppercase OLS, needs a formula string as its first argument.
So, I think you want string concatenation:
smf.ols('imp ~ ' + cat_feature, data=df).fit()
Passing pandas Series/DataFrames or numpy arrays directly only works with the main class OLS:
import statsmodels.api as sm
model2 = sm.OLS(df['imp'], df[cat_feature]).fit()
As background information:
OLS is the actual model class
ols from formula.api is just a convenience alias for the method OLS.from_formula, which preprocesses the formula information before creating an OLS instance.
The character ~ is a required element of the formula string, but in regular Python it is a unary operator, so it cannot be used to separate two arguments; df.A ~ df.B is a syntax error.
One crucial distinction between the array/dataframe and the formula interface:
The array interface, i.e. using OLS as in
sm.OLS(df['imp'], df[cat_feature])
does not do any preprocessing of the data, i.e. exog is taken as is. Specifically, no constant is added and categorical features are not encoded in some numerical dummy or contrast representation.
The formula interface uses patsy, which preprocesses the data largely the same way R's formulas do. This means that a constant is added by default and any non-numeric columns, like those that contain strings, are processed as categorical or factor variables.
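The same made-up categorical example through the formula interface shows that preprocessing happening automatically:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Same kind of made-up data: one string-valued categorical column
rng = np.random.default_rng(2)
df = pd.DataFrame({
    'imp': rng.normal(size=50),
    'cat': rng.choice(['a', 'b', 'c'], size=50),
})

# patsy adds the intercept and dummy-codes the string column automatically
model = smf.ols('imp ~ cat', data=df).fit()
print(model.params.index.tolist())
# ['Intercept', 'cat[T.b]', 'cat[T.c]']
```

The cat[T.b] / cat[T.c] names reflect patsy's default treatment coding, with the first level ('a') absorbed into the intercept.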