Wrapper Methods for feature selection (Machine Learning) In Scikit Learn

I am trying to decide between scikit-learn and the Weka data mining tool for my machine learning project. However, I have realized that I need feature selection. Does scikit-learn provide wrapper methods for feature selection?

Sean Sog Miller asked Feb 25 '16 23:02


People also ask

What are wrapper and filters in feature selection?

Filter methods measure the relevance of features by their correlation with the dependent variable, while wrapper methods measure the usefulness of a subset of features by actually training a model on it. Filter methods are much faster than wrapper methods because they do not involve training models.

What are the three types of feature selection methods?

There are three types of feature selection: Wrapper methods (forward, backward, and stepwise selection), Filter methods (ANOVA, Pearson correlation, variance thresholding), and Embedded methods (Lasso, Ridge, Decision Tree).

Which method can be used for feature selection?

Exhaustive feature selection is a brute-force method: it evaluates every possible combination of features and returns the best-performing feature set. Because the number of subsets doubles with each additional feature, it is only practical when the number of features is small.
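The brute-force idea can be sketched in a few lines of plain Python. This is only an illustration: `exhaustive_select` and `toy_score` are hypothetical names, and `toy_score` stands in for whatever cross-validated model score would be used in practice.

```python
from itertools import combinations


def exhaustive_select(features, score_fn):
    """Score every non-empty subset of `features` and return the best one."""
    best_subset, best_score = None, float("-inf")
    for k in range(1, len(features) + 1):
        for subset in combinations(features, k):
            score = score_fn(subset)
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score


# Hypothetical scorer (an assumption, not a real model): rewards the two
# "useful" features and charges a small penalty for every feature kept.
def toy_score(subset):
    useful = {"age", "income"}
    return len(useful & set(subset)) - 0.1 * len(subset)


best, score = exhaustive_select(["age", "income", "zip", "id"], toy_score)
print(best, score)
```

With n features the loop scores 2^n - 1 subsets, which is why this approach does not scale beyond a handful of features.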

Is RFE a wrapper method?

Recursive Feature Elimination (RFE) is a wrapper method. The algorithm fits a model and ranks the features by how important they are to that model. Once the feature importances have been determined, it removes the least important features, one at a time in each iteration, and repeats until the desired number of features remains.


1 Answer

scikit-learn supports Recursive Feature Elimination (RFE), which is a wrapper method for feature selection.
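As a minimal sketch of how RFE might be used, assuming a synthetic dataset from `make_classification` and a logistic regression as the underlying estimator (both are illustrative choices, not part of the question):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Synthetic data (assumed for illustration): 10 features, 3 informative.
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=3, n_redundant=0, random_state=0
)

# RFE refits the estimator repeatedly and drops the lowest-weighted
# feature each round (step=1) until 3 features remain.
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=3, step=1)
selector.fit(X, y)

print(selector.support_)  # boolean mask of the features that were kept
print(selector.ranking_)  # 1 = kept; larger values were eliminated earlier
X_reduced = selector.transform(X)
```

Any estimator that exposes `coef_` or `feature_importances_` after fitting should work in place of the logistic regression.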

mlxtend, a separate Python library that is designed to work well with scikit-learn, also provides a Sequential Feature Selector (SFS) that works a bit differently:

RFE is computationally less complex using the feature's weight coefficients (e.g., linear models) or feature importances (tree-based algorithms) to eliminate features recursively, whereas SFSs eliminate (or add) features based on a user-defined classifier/regression performance metric.
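For comparison, here is a rough sketch of sequential selection. Note that it uses scikit-learn's own `SequentialFeatureSelector` (available from scikit-learn 0.24) rather than the mlxtend class mentioned above; the two are similar in spirit but have different APIs. The dataset and the k-nearest-neighbors estimator are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data (assumed for illustration): 8 features, 3 informative.
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=3, n_redundant=0, random_state=0
)

# Forward selection: start with no features and, at each step, add the one
# that gives the best cross-validated accuracy. KNN has no coef_ or
# feature_importances_, so RFE could not rank features with it.
sfs = SequentialFeatureSelector(
    KNeighborsClassifier(n_neighbors=5),
    n_features_to_select=3,
    direction="forward",
    scoring="accuracy",
    cv=5,
)
sfs.fit(X, y)

mask = sfs.get_support()  # boolean mask of the selected features
X_reduced = sfs.transform(X)
print(mask)
```

Because every candidate feature is scored by cross-validation at every step, this is typically slower than RFE, which is the trade-off the quote above describes.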

Kevin Markham answered Nov 10 '22 05:11