I am trying to get the scores of all the features of my data set.
import numpy
from sklearn.feature_selection import SelectKBest, chi2

file_data = numpy.genfromtxt(input_file)
y = file_data[:,-1]
X = file_data[:,0:-1]
x_new = SelectKBest(chi2, k='all').fit_transform(X,y)
Previously, the first row of X held the feature names as strings, and I was getting the error "Input contains NaN, infinity or a value too large for dtype('float64')". Now X contains only the data and y contains the target values (1, -1).
How can I get the score of each feature from SelectKBest? (I am trying to use univariate feature selection.)
thanks
Solution
You just need to fit the selector first and then read its scores_ attribute, like this:
import numpy
from sklearn.feature_selection import SelectKBest, chi2

file_data = numpy.genfromtxt(input_file)
y = file_data[:,-1]
X = file_data[:,0:-1]
selector = SelectKBest(chi2, k='all').fit(X,y)
x_new = selector.transform(X) # not needed to get the score
scores = selector.scores_ # one score per feature (column of X)
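If you want to try this without your input file, here is a minimal sketch on synthetic data. The random data, the seed and the variable names are my own assumptions, not taken from your data set. Note that chi2 only accepts non-negative feature values, so the sketch generates non-negative integers.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

# Synthetic stand-in for the real file (assumption): 50 samples, 4 features.
# chi2 rejects negative feature values, so keep them non-negative.
rng = np.random.RandomState(0)
X = rng.randint(0, 10, size=(50, 4)).astype(float)
y = rng.choice([1, -1], size=50)

# Keep the fitted selector so its attributes stay accessible.
selector = SelectKBest(chi2, k="all").fit(X, y)
scores = selector.scores_      # one chi2 statistic per column of X
pvalues = selector.pvalues_    # the matching p-values

# fit_transform returns only the data; with k="all" every column is kept.
x_new = SelectKBest(chi2, k="all").fit_transform(X, y)

for i, (s, p) in enumerate(zip(scores, pvalues)):
    print(f"feature {i}: score={s:.3f}, p-value={p:.3f}")
```

The fitted selector also exposes pvalues_, which can be handy if you want to apply a significance threshold instead of ranking by score.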
Your problem
When you call .fit_transform(features, target) directly, the fitted selector is not stored and you only get back the selected features. However, the scores are an attribute of the selector. To get them, you have to use .fit(features, target). Once the selector is fitted, you can get the selected features by calling selector.transform(features), as you can see in the code above.
As I commented in the code, you don't need to transform the features to get the scores. Fitting the selector is enough.
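Since you mentioned that your file originally had the feature names in the first row, here is a rough sketch of one way to keep those names and pair them with the scores. The inline text, the names f1, f2, f3 and the whitespace-separated layout are hypothetical; adapt them to your real file. With the default float dtype, numpy.genfromtxt turns a row of strings into NaN, which is what triggers the error you saw, so the sketch skips that row with skip_header=1.

```python
import io
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

# Hypothetical file contents: a header row of names, then numeric rows.
raw = """f1 f2 f3 target
1 0 3 1
2 1 0 -1
0 4 1 1
3 0 2 -1
1 2 5 1
4 1 0 -1
"""

# Read the feature names separately; the last column is the target.
names = raw.splitlines()[0].split()[:-1]

# Parsed as floats, the header row would become NaN, so skip it.
file_data = np.genfromtxt(io.StringIO(raw), skip_header=1)
y = file_data[:, -1]
X = file_data[:, :-1]

selector = SelectKBest(chi2, k="all").fit(X, y)

# Pair each name with its score and sort from highest to lowest.
ranked = sorted(zip(names, selector.scores_), key=lambda pair: pair[1], reverse=True)
for name, score in ranked:
    print(name, round(score, 3))
```

With a real file you would pass its path to genfromtxt instead of the StringIO object and read the first line yourself to get the names.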