
Feature names in the importance plot after preprocessing

Tags:

python

xgboost

Before building a model, I scale the features like this:

X = StandardScaler(with_mean = 0, with_std = 1).fit_transform(X)

and afterwards I build a feature importance plot:

xgb.plot_importance(bst, color='red')
plt.title('importance', fontsize = 20)
plt.yticks(fontsize = 10)
plt.ylabel('features', fontsize = 20)


The problem is that instead of the feature names, the plot shows f0, f1, f2, f3, etc. How can I get the original feature names back?

thanks

Edward asked Jul 26 '16 22:07




1 Answer

First, we get the list of feature names before preprocessing, while X still carries its column names:

dtrain = xgb.DMatrix( X, label=y)
dtrain.feature_names

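Then we map the default keys f0, f1, ... back to those names and plot the mapped scores:

# Importance scores keyed by the default names f0, f1, ...
fscore = bst.get_fscore()
# Map each default name to the original feature name
mapper = {'f{0}'.format(i): v for i, v in enumerate(dtrain.feature_names)}
mapped = {mapper[k]: v for k, v in fscore.items()}
# plot_importance also accepts a dict of scores
xgb.plot_importance(mapped, color='red')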

That's all.
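The remapping step can also be wrapped in a small helper. This is only a sketch in plain Python: the function name `rename_fscore`, the column names and the scores are made up for illustration, and it assumes the keys follow the default `f<index>` pattern seen in the plot.

```python
def rename_fscore(fscore, feature_names):
    """Map default keys ('f0', 'f1', ...) to the given feature names."""
    mapper = {"f{0}".format(i): name for i, name in enumerate(feature_names)}
    # Keys that are not in the mapper (e.g. already-named features) are kept as-is.
    return {mapper.get(key, key): score for key, score in fscore.items()}


# Hypothetical column names and scores, for illustration only.
columns = ["age", "income", "height"]
fscore = {"f0": 12, "f2": 5}  # features never used in a split are simply absent

named = rename_fscore(fscore, columns)
print(named)
```

The resulting dict can then be passed to xgb.plot_importance in place of `mapped`, as in the snippet above. Using `mapper.get(key, key)` instead of `mapper[k]` is a design choice here, so that an unexpected key is left unchanged rather than raising a KeyError.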

Edward answered Oct 25 '22 01:10