
How to get the scores of each feature from sklearn.feature_selection.SelectKBest?

I am trying to get the scores of all the features of my data set.

import numpy
from sklearn.feature_selection import SelectKBest, chi2

file_data = numpy.genfromtxt(input_file)
y = file_data[:,-1]
X = file_data[:,0:-1]

x_new = SelectKBest(chi2, k='all').fit_transform(X,y)

Previously, the first row of X held the feature names as strings, and I was getting the error "Input contains NaN, infinity or a value too large for dtype('float64')". Now X contains only the data and y contains the target values (1, -1).

How can I get the score of each feature from SelectKBest? (I am trying to use univariate feature selection.)

thanks

Black Dragon asked Dec 14 '22 11:12


1 Answer

Solution

You just have to do something like this.

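import numpy
from sklearn.feature_selection import SelectKBest, chi2

file_data = numpy.genfromtxt(input_file)
y = file_data[:,-1]    # last column: target values
X = file_data[:,0:-1]  # all other columns: features

selector = SelectKBest(chi2, k='all').fit(X,y) # fit and keep the selector
x_new = selector.transform(X) # not needed to get the score
scores = selector.scores_ # one score per feature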


Your problem

When you call .fit_transform(features, target) directly on a newly constructed SelectKBest, the selector itself is never stored; you only get back the selected features. The scores, however, are an attribute of the selector. To get them, call .fit(features, target) and keep the fitted selector. Once the selector is fitted, you can get the selected features by calling selector.transform(features), as shown in the code above.
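As a small illustration of the point above, the sketch below keeps a reference to the selector and then calls fit_transform on it, so both the transformed array and the scores are available. The data is synthetic and only there to make the snippet runnable; it is not the asker's data set.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

# Synthetic, non-negative data (chi2 requires non-negative features).
rng = np.random.RandomState(0)
X = rng.randint(0, 10, size=(20, 4)).astype(float)
y = np.array([1, -1] * 10)

# Keep a reference to the selector, then call fit_transform on it.
selector = SelectKBest(chi2, k='all')
x_new = selector.fit_transform(X, y)

# fit_transform returns a plain array; the scores live on the selector.
scores = selector.scores_
print(type(x_new).__name__, x_new.shape)
print(scores)
```

So fit_transform is fine to use, as long as the selector object is assigned to a variable before the call.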

As noted in the code comment, you do not need to transform the features to get the scores; fitting the selector is enough.
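If the input file still has the feature names in its first row, one possible way to keep them and avoid the NaN error is to read the header separately and skip it when loading the numbers. The sketch below assumes a whitespace-separated file with a single header row and the target in the last column; the file contents and the names f1, f2, f3 are hypothetical.

```python
import io
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

# Hypothetical file contents: a header row, then numeric rows;
# the last column is the target (1 or -1).
raw = """f1 f2 f3 target
1 0 3 1
0 2 1 -1
4 1 0 1
0 3 2 -1
5 0 1 1
1 4 0 -1
"""

# Read the header separately so it is not parsed as numbers (which gives NaN).
feature_names = raw.splitlines()[0].split()[:-1]
file_data = np.genfromtxt(io.StringIO(raw), skip_header=1)

y = file_data[:, -1]
X = file_data[:, :-1]

selector = SelectKBest(chi2, k='all').fit(X, y)

# Pair each feature name with its chi-squared score, highest first.
ranked = sorted(zip(feature_names, selector.scores_),
                key=lambda pair: pair[1], reverse=True)
for name, score in ranked:
    print(name, round(score, 3))
```

With a real file, the same idea applies by passing the file path to genfromtxt with skip_header=1 and reading the first line yourself for the names.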


Links

  • Documentation about SelectKBest in sklearn
  • Example in the docs

Daniel Reina answered Apr 27 '23 11:04