Can I make partial plots for DecisionTreeClassifier in scikit-learn (and R)?

I have some old code using scikit-learn's DecisionTreeClassifier. I'd like to make partial plots based on this classifier.

All the examples I've seen so far (such as http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.partial_dependence.plot_partial_dependence.html) use "GradientBoostingRegressor" as the estimator.

My question is: is it possible to make partial plots for other classifiers (e.g. DecisionTreeClassifier)? I've tried the following code:

from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble.partial_dependence import plot_partial_dependence
from sklearn.datasets import make_friedman1

X, y = make_friedman1()
clf = DecisionTreeClassifier(max_features='auto').fit(X,y)
fig, axs = plot_partial_dependence(clf, X, [0, (0, 1)])

and it doesn't work.

ValueError: gbrt has to be an instance of BaseGradientBoosting

I've found a comment on the internet (Quora):

Partial dependence plots don't depend on the particular choice of classifier at all. The partial dependence plot module used for the gradient boosting example would work fine if you swapped in a random forest classifier.

However, I still don't know how it works.

Also, in R, it seems I can make partial plots with the randomForest package. However, I'm not exactly sure how the random forest is implemented: in the R manual, the author Andy Liaw cites the reference "Friedman, J. (2001). Greedy function approximation: the gradient boosting machine, Ann. of Stat."

Does this mean I have to use gradient-boosting in order to get partial plots?

Any help is appreciated. Thanks a lot!

user2921752 asked Feb 14 '14 20:02

1 Answer

As your error message states, plot_partial_dependence only accepts an estimator that inherits from BaseGradientBoosting.

From the documentation you posted:

gbrt : BaseGradientBoosting

A fitted gradient boosting model

Both GradientBoostingClassifier and GradientBoostingRegressor inherit from BaseGradientBoosting, so either of those classes should work, in theory. Other estimators, such as DecisionTreeClassifier, do not appear to be supported by the plot_partial_dependence function.
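
That restriction belongs to this particular helper rather than to the technique. As the Quora comment in the question says, partial dependence only needs a model's predictions: fix one feature at a grid value for every row, predict, and average. Below is a minimal sketch of that brute-force calculation for a DecisionTreeClassifier. The helper name partial_dependence_1d and the median threshold used to turn the continuous make_friedman1 target into class labels are my own assumptions for illustration, not part of scikit-learn or of the original code.

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.tree import DecisionTreeClassifier


def partial_dependence_1d(predict, X, feature, grid):
    """Hypothetical helper: for each grid value, force column `feature`
    to that value in every row and average the model's predictions."""
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value  # overwrite one feature for all samples
        averages.append(predict(X_mod).mean())
    return np.array(averages)


# Assumption: make_friedman1 returns a continuous target, so it is
# thresholded at its median to obtain labels a classifier can be fitted on.
X, y_cont = make_friedman1(n_samples=200, random_state=0)
y = (y_cont > np.median(y_cont)).astype(int)

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)


def predict_positive(A):
    # Predicted probability of class 1 is the quantity being averaged.
    return clf.predict_proba(A)[:, 1]


# Evaluate feature 0 on 20 evenly spaced points across its observed range.
grid = np.linspace(X[:, 0].min(), X[:, 0].max(), 20)
pd_values = partial_dependence_1d(predict_positive, X, feature=0, grid=grid)
```

Plotting grid against pd_values (for example with matplotlib) gives the one-feature partial plot. Because the helper only takes a prediction function, the same call should work for a random forest or any other fitted model. More recent scikit-learn releases also provide sklearn.inspection.partial_dependence, which accepts arbitrary fitted estimators, so check your installed version before rolling your own.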

James Mnatzaganian answered Oct 18 '22 22:10