 

Feature selection on a keras model

I was trying to find the features that most influence the output of my regression model. Following is my code.

import numpy as np
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import RFE

seed = 7
np.random.seed(seed)

estimators = []
estimators.append(('mlp', KerasRegressor(build_fn=baseline_model, epochs=3,
                                         batch_size=20)))
pipeline = Pipeline(estimators)
rfe = RFE(estimator=pipeline, n_features_to_select=5)
fit = rfe.fit(X_set, Y_set)

But I get the following runtime error when I run it.

RuntimeError: The classifier does not expose "coef_" or "feature_importances_" attributes

How can I overcome this issue and select the best features for my model? If that is not possible, can I use algorithms like LogisticRegression(), which are supported by RFE in scikit-learn, to find the best features for my dataset?

Klaus asked May 06 '18 11:05

People also ask

How is feature selection done in deep learning?

There are numerous methods for feature selection, which, according to the literature [45] can be divided into three categories: wrapper, filter, and embedded methods. Wrapper methods are those where the model is trained with different combinations of input features to determine which gives the best results.

Is feature selection necessary for deep learning?

So, the conclusion is that deep learning networks do not need a previous feature selection step.

What are feature selection models?

What is Feature Selection? Feature Selection is the method of reducing the input variable to your model by using only relevant data and getting rid of noise in data. It is the process of automatically choosing relevant features for your machine learning model based on the type of problem you are trying to solve.

Can CNN be used for feature selection?

Feature selection is an important technique to improve neural network performances due to the redundant attributes and the massive amount in original data sets. In this paper, a CNN with two convolutional layers followed by a dropout, then two fully connected layers, is equipped with a feature selection algorithm.


2 Answers

I assume your Keras model is some kind of neural network. And with NNs in general it is hard to see which input features are relevant and which are not. The reason is that each input feature has multiple coefficients linked to it, one for each node of the first hidden layer. Adding more hidden layers makes it even harder to determine how big an impact an input feature has on the final prediction.

On the other hand, for linear models it is very straightforward since each feature x_i has a corresponding weight/coefficient w_i and its magnitude directly determines how big of an impact it has in prediction (assuming that features are scaled of course).

The RFE estimator (Recursive Feature Elimination) assumes that your prediction model has an attribute coef_ (linear models) or feature_importances_ (tree models) whose length equals the number of input features and which represents their relevance (in absolute terms).
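To illustrate, here is a minimal sketch (with synthetic data) showing that RFE works out of the box with an estimator that exposes coef_, such as LinearRegression:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression

# Synthetic regression data: 10 features, only 5 of them informative.
X, y = make_regression(n_samples=200, n_features=10, n_informative=5,
                       random_state=7)

# LinearRegression exposes coef_, so RFE can rank features by
# coefficient magnitude and recursively drop the weakest ones.
rfe = RFE(estimator=LinearRegression(), n_features_to_select=5)
rfe.fit(X, y)

print(rfe.support_)   # boolean mask of the 5 selected features
print(rfe.ranking_)   # rank 1 = selected, higher = eliminated earlier
```

Wrapping a Keras model in a Pipeline, as in the question, fails precisely because neither the pipeline nor the network exposes coef_ or feature_importances_.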

My suggestion:

  1. Feature selection:
     - (Option a) Run RFE on any linear / tree model to reduce the number of features to some desired n_features_to_select.
     - (Option b) Use regularized linear models like lasso / elastic net that enforce sparsity. The drawback here is that you cannot directly set the number of selected features.
     - (Option c) Use any other feature selection technique from scikit-learn's feature_selection module.
  2. Neural network: use only the features from (1) for your neural network.
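The two-step suggestion can be sketched as follows. This is an illustrative example with synthetic data; MLPRegressor stands in for the Keras model so the sketch stays self-contained, but the selected columns could just as well be fed to a KerasRegressor:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.neural_network import MLPRegressor

# Synthetic data: 20 features, only 5 informative.
X, y = make_regression(n_samples=300, n_features=20, n_informative=5,
                       random_state=7)

# Step 1: feature selection with RFE on a tree model.
# RandomForestRegressor exposes feature_importances_, so RFE accepts it.
selector = RFE(estimator=RandomForestRegressor(n_estimators=50,
                                               random_state=7),
               n_features_to_select=5)
X_selected = selector.fit_transform(X, y)

# Step 2: train the neural network only on the selected features.
nn = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500, random_state=7)
nn.fit(X_selected, y)

print(X_selected.shape)  # (300, 5)
```

The key point is that the model used for selection and the model used for prediction do not have to be the same.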
Jan K answered Sep 20 '22 13:09


Suggestion:

Perform the RFE algorithm with a scikit-learn estimator to observe feature importance. Then use the most important features to train your Keras model.

To your question: Standardization is not required for logistic regression
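As the question asks, LogisticRegression() is indeed supported directly by RFE, since it exposes coef_. A minimal sketch with synthetic classification data (note that logistic regression needs a classification target, not a regression one):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Synthetic classification data: 10 features, 4 informative.
X, y = make_classification(n_samples=200, n_features=10, n_informative=4,
                           random_state=7)

# LogisticRegression exposes coef_, so RFE accepts it without error.
rfe = RFE(estimator=LogisticRegression(max_iter=1000),
          n_features_to_select=4)
rfe.fit(X, y)

print(rfe.support_.sum())  # 4 features selected
```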

Hendouz answered Sep 22 '22 13:09