
Statsmodels: Short way of writing Formula

I am fitting a logistic regression model using statsmodels:

log_reg = st.logit(formula = 'label ~ pregnant + glucose + bp + insulin + bmi + pedigree + age', data=pima).fit()

Is there a shorter way of writing the right-hand side of the formula (pregnant + glucose + bp + insulin + bmi + pedigree + age)? As written, every column has to be listed explicitly. With more than 100 columns, the formula would be tedious to type and the statement would become very long.

BhushanD asked Sep 26 '22 07:09


1 Answer

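If df is a pd.DataFrame and y is the name of the target column, the function below returns the formula string you are looking for: the target on the left-hand side and every other column of df, joined with ' + ', on the right.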

def formula_from_cols(df, y):
    # Use every column except the target y as a predictor.
    return y + ' ~ ' + ' + '.join(col for col in df.columns if col != y)
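
As a usage illustration, here is a minimal sketch of the same idea that works on any list of column names (for example `list(df.columns)`). The name `build_formula`, the `exclude` parameter, and the `patient_id` column are hypothetical and not part of the answer above. Wrapping names that are not valid Python identifiers in `Q("...")` relies on the quoting helper that patsy formulas provide; it assumes the column names contain no double quotes.

```python
def build_formula(columns, target, exclude=()):
    """Return 'target ~ a + b + ...' built from an iterable of column names."""
    skipped = {target, *exclude}
    predictors = [c for c in columns if c not in skipped]
    if not predictors:
        raise ValueError("no predictor columns left after exclusions")

    def quote(name):
        # Names with spaces or other special characters must be quoted
        # so the formula parser treats them as a single variable.
        return name if name.isidentifier() else f'Q("{name}")'

    return f"{quote(target)} ~ " + " + ".join(quote(c) for c in predictors)


# Column names taken from the question, plus a hypothetical 'patient_id'
# column that should not be used as a predictor.
columns = ["patient_id", "pregnant", "glucose", "bp", "insulin",
           "bmi", "pedigree", "age", "label"]
formula = build_formula(columns, "label", exclude=["patient_id"])
print(formula)
# label ~ pregnant + glucose + bp + insulin + bmi + pedigree + age
```

The resulting string can then be passed as the `formula` argument of the `logit` call in the question, with `data=pima` unchanged.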
Mór Kapronczay answered Oct 11 '22 13:10