I was trying to find the best features that dominate for the output of my regression model, Following is my code.
seed = 7
np.random.seed(seed)
estimators = []
estimators.append(('mlp', KerasRegressor(build_fn=baseline_model, epochs=3,
batch_size=20)))
pipeline = Pipeline(estimators)
rfe = RFE(estimator= pipeline, n_features_to_select=5)
fit = rfe.fit(X_set, Y_set)
But I get the following runtime error when running.
RuntimeError: The classifier does not expose "coef_" or "feature_importances_" attributes
How to overcome this issue and select best features for my model? If not, Can I use algorithms like LogisticRegression() provided and supported by RFE in Scikit to achieve the task of finding best features for my dataset?
There are numerous methods for feature selection, which, according to the literature [45] can be divided into three categories: wrapper, filter, and embedded methods. Wrapper methods are those where the model is trained with different combinations of input features to determine which gives the best results.
So, the conclusion is that Deep Learning Networks do not need a previos feature selection step.
What is Feature Selection? Feature Selection is the method of reducing the input variable to your model by using only relevant data and getting rid of noise in data. It is the process of automatically choosing relevant features for your machine learning model based on the type of problem you are trying to solve.
Feature selection is an important technique to improve neural network performances due to the redundant attributes and the massive amount in original data sets. In this paper, a CNN with two convolutional layers followed by a dropout, then two fully connected layers, is equipped with a feature selection algorithm.
I assume your Keras model is some kind of a neural network. And with NN in general it is kind of hard to see which input features are relevant and which are not. The reason for this is that each input feature has multiple coefficients that are linked to it - each corresponding to one node of the first hidden layer. Adding additional hidden layers makes it even more complicated to determine how big of an impact the input feature has on the final prediction.
On the other hand, for linear models it is very straightforward since each feature x_i has a corresponding weight/coefficient w_i and its magnitude directly determines how big of an impact it has in prediction (assuming that features are scaled of course).
The RFE estimator (Recursive feature elimination) assumes that your prediction model has an attribute coef_
(linear models) or feature_importances_
(tree models) that has the length of input features and that it represents their relevance (in absolute terms).
My suggestion:
n_features_to_select
. (Option b) Use regularized linear models like lasso / elastic net that enforce sparsity. The problem here is that you cannot directly set the actual number of selected features. (Option c) Use any other feature selection technique from here.Suggestion:
Perform the RFE algorithm on a sklearn-based algorithm to observe feature importance. Finally, you use the most importantly observed features to train your algorithm based on Keras.
To your question: Standardization is not required for logistic regression
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With