How to interpret feature importance after one-hot encoding for a decision tree

Question

I know a decision tree has a feature importance attribute (feature_importances_ in scikit-learn), calculated from Gini impurity, which can be used to check which features are more important.

However, scikit-learn and Spark only accept numeric attributes, so I have to convert string attributes to numeric ones and then one-hot encode them. By the time the features reach the decision tree model, they are 0/1 encoded rather than in their original format. My question is: how do I explain feature importance for the original attributes? Should I avoid one-hot encoding when trying to explain feature importance?

Thanks.

Josh · Accepted Answer

Conceptually, you may want to use something along the lines of permutation importance. The basic idea is that you take your original dataset and randomly shuffle the values of each column, one at a time. Then you score the perturbed data with the model and compare the performance to the original performance. Done one column at a time, this lets you assess the performance hit you take by destroying each variable, indexed to the variable with the largest loss (which becomes 1, or 100%). If you can do this on your original dataset, prior to the one-hot encoding, then you'll get an importance measure that groups the encoded columns of each attribute together.
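As a rough sketch of that idea in scikit-learn: if the one-hot encoding lives inside a Pipeline, the fitted model still takes the original columns as input, so permutation_importance from sklearn.inspection can shuffle them directly. The toy data, the column names (color, size, weight) and the choice of accuracy on the training data are all assumptions made for illustration; in practice you would probably score on held-out data.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.inspection import permutation_importance
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: the label depends only on "color".
rng = np.random.default_rng(0)
n = 300
X = pd.DataFrame({
    "color": rng.choice(["red", "green", "blue"], size=n),
    "size": rng.choice(["S", "M", "L"], size=n),
    "weight": rng.normal(size=n),
})
y = (X["color"] == "red").astype(int)

# One-hot encoding happens inside the pipeline, so the model as a whole
# still accepts the original, un-encoded columns.
encode = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), ["color", "size"])],
    remainder="passthrough",
)
model = Pipeline([
    ("encode", encode),
    ("tree", DecisionTreeClassifier(random_state=0)),
])
model.fit(X, y)

# Shuffle one original column at a time and measure the drop in score
# (accuracy, the default for a classifier).
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)

# Index each importance to the largest one, so the top feature becomes 1.0.
importances = pd.Series(result.importances_mean, index=X.columns)
relative = importances / importances.max()
print(relative.sort_values(ascending=False))
```

Because the shuffling is applied to the color column as a whole, its three dummy columns are scored as one attribute, which is the grouping the question asks for.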

Tags:

machine-learning

scikit-learn

decision-tree

linpingta
