Before building a model I scale the features like this:

from sklearn.preprocessing import StandardScaler

X = StandardScaler(with_mean=False, with_std=True).fit_transform(X)
and afterwards build a feature importance plot:

import xgboost as xgb
import matplotlib.pyplot as plt

xgb.plot_importance(bst, color='red')
plt.title('importance', fontsize=20)
plt.yticks(fontsize=10)
plt.ylabel('features', fontsize=20)
The problem is that, instead of the feature names, the plot shows f0, f1, f2, f3 and so on. How can I get the real feature names back?

Thanks!
Feature importance (variable importance) describes which features are relevant. It can help with a better understanding of the problem being solved and can sometimes lead to model improvements through feature selection.

Feature importance refers to techniques that assign a score to each input feature of a given model; the scores simply represent the "importance" of each feature. A higher score means that the feature has a larger effect on the model's predictions.

Note that feature importance is only defined for tree boosters, i.e. when a decision tree is chosen as the base learner (booster=gbtree). It is not defined for other base learner types, such as the linear learner (booster=gblinear).
The XGBoost library provides a built-in function to plot features ordered by their importance. However, when the model is trained on data that carries no column names (for example a NumPy array, which is exactly what StandardScaler.fit_transform returns), the features are automatically named f0, f1, f2, ... according to their index.
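For instance, here is a minimal sketch (synthetic data and hypothetical parameters, just for illustration) showing that a tree booster trained on a bare NumPy array only knows these positional names:

import numpy as np
import xgboost as xgb

# A NumPy array carries no column names.
X = np.random.rand(100, 3)
y = np.random.randint(2, size=100)

dtrain = xgb.DMatrix(X, label=y)
bst = xgb.train({'booster': 'gbtree', 'objective': 'binary:logistic'},
                dtrain, num_boost_round=10)

# The importance dict is keyed by the generated positional names.
print(bst.get_fscore())  # e.g. {'f0': 12, 'f2': 7, 'f1': 5}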
First, get the list of feature names before preprocessing, while X is still the original pandas DataFrame with named columns:
dtrain = xgb.DMatrix(X, label=y)
dtrain.feature_names
Then map the fN keys returned by get_fscore() back to those names and plot:
# bst.get_fscore() returns scores keyed by the generated names: {'f0': ..., 'f1': ...}
mapper = {'f{0}'.format(i): v for i, v in enumerate(dtrain.feature_names)}
mapped = {mapper[k]: v for k, v in bst.get_fscore().items()}

# plot_importance also accepts a dict of feature name -> score
xgb.plot_importance(mapped, color='red')
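As a side note (a sketch with synthetic stand-in data and hypothetical column names, not part of the original recipe), the remapping can be avoided altogether by handing the real names to the DMatrix via its feature_names argument:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import xgboost as xgb
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the asker's data.
X = pd.DataFrame(np.random.rand(100, 3), columns=['age', 'income', 'score'])
y = np.random.randint(2, size=100)

# Capture the real column names before scaling turns X into a bare array.
feature_names = list(X.columns)
X_scaled = StandardScaler().fit_transform(X)

# Attaching the names to the DMatrix keeps them end to end.
dtrain = xgb.DMatrix(X_scaled, label=y, feature_names=feature_names)
bst = xgb.train({'objective': 'binary:logistic'}, dtrain, num_boost_round=10)

# plot_importance now labels the bars with the real names directly.
xgb.plot_importance(bst, color='red')
plt.show()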
That's all.