How to perform stepwise regression in python? There are methods for OLS in SCIPY but I am not able to do stepwise. Any help in this regard would be a great help. Thanks.
Edit: I am trying to build a linear regression model. I have 5 independent variables and using forward stepwise regression, I aim to select variables such that my model has the lowest p-value. Following link explains the objective:
https://www.google.co.in/url?sa=t&rct=j&q=&esrc=s&source=web&cd=4&ved=0CEAQFjAD&url=http%3A%2F%2Fbusiness.fullerton.edu%2Fisds%2Fjlawrence%2FStat-On-Line%2FExcel%2520Notes%2FExcel%2520Notes%2520-%2520STEPWISE%2520REGRESSION.doc&ei=YjKsUZzXHoPwrQfGs4GQCg&usg=AFQjCNGDaQ7qRhyBaQCmLeO4OD2RVkUhzw&bvm=bv.47244034,d.bmk
Thanks again.
This also reduces the compute time and complexity of the problem. Stepwise regression is a technique for feature selection in multiple linear regression. There are three types of stepwise regression: backward elimination, forward selection, and bidirectional elimination. Let us explore what backward elimination is.
1 Answer 1. Scikit-learn indeed does not support stepwise regression. That's because what is commonly known as 'stepwise regression' is an algorithm based on p-values of coefficients of linear regression, and scikit-learn deliberately avoids inferential approach to model learning (significance testing etc).
There are methods for OLS in SCIPY but I am not able to do stepwise. Any help in this regard would be a great help. Thanks. Edit: I am trying to build a linear regression model.
But f_regression does not do stepwise regression but only give F-score and pvalues corresponding to each of the regressors, which is only the first step in stepwise regression. What to do after 1st regressors with the best f-score is chosen? Scikit-learn indeed does not support stepwise regression.
Trevor Smith and I wrote a little forward selection function for linear regression with statsmodels: http://planspace.org/20150423-forward_selection_with_statsmodels/ You could easily modify it to minimize a p-value, or select based on beta p-values with just a little more work.
You may try mlxtend which got various selection methods.
from mlxtend.feature_selection import SequentialFeatureSelector as sfs clf = LinearRegression() # Build step forward feature selection sfs1 = sfs(clf,k_features = 10,forward=True,floating=False, scoring='r2',cv=5) # Perform SFFS sfs1 = sfs1.fit(X_train, y_train)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With