Logistic regression model using statsmodels:
import statsmodels.formula.api as st  # assumption: `st` is an alias for the formula API
log_reg = st.logit(formula='label ~ pregnant + glucose + bp + insulin + bmi + pedigree + age', data=pima).fit()
Is there a shorter way to write the right-hand side of the formula (pregnant + glucose + bp + insulin + bmi + pedigree + age)? Here every column has to be listed explicitly. With more than 100 columns, this would be tedious to write and the statement would become very long.
If df is a pd.DataFrame and y is the name of the target column (a string), this function returns the formula string you are looking for.
def formula_from_cols(df, y):
    # Join every column except the target into 'y ~ a + b + ...'
    return y + ' ~ ' + ' + '.join([col for col in df.columns if col != y])
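The same idea can be sketched without a DataFrame, working directly on a list of column names. This is only an illustration: `build_formula` and its `exclude` parameter are hypothetical names, not part of statsmodels. It also assumes the default patsy formula parser, whose `Q("...")` helper quotes column names that are not plain identifiers.

```python
def build_formula(columns, target, exclude=()):
    """Return 'target ~ a + b + ...' from every column except target and exclude."""
    skip = {target, *exclude}
    terms = []
    for col in columns:
        if col in skip:
            continue
        # Assumption: names with spaces or dashes are wrapped in patsy's Q("...")
        # so the formula parser reads them as a single column name.
        terms.append(col if col.isidentifier() else f'Q("{col}")')
    return f"{target} ~ " + " + ".join(terms)


cols = ["pregnant", "glucose", "bp", "insulin", "bmi", "pedigree", "age", "label"]
formula = build_formula(cols, "label")
print(formula)  # label ~ pregnant + glucose + bp + insulin + bmi + pedigree + age
```

With the data from the question, the call would then look like st.logit(formula=build_formula(pima.columns, 'label'), data=pima).fit(). Columns you want to leave out (an ID column, for example) can be passed through exclude.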