
Using scikit-learn DecisionTreeClassifier to cluster

When using sklearn.tree.DecisionTreeClassifier, the classifier has methods for predicting probability and class.

Is there a way to use the same tree for clustering: for a given input vector x, simply tell which leaf x belongs to?

Guy Adini asked Jan 16 '13 15:01





1 Answer

I found the answer to my own question - leaving it here as reference for the next time someone looks for it:

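import numpy as np
import sklearn.tree

# X is the (n_samples, n_features) training array and y holds the class labels.
clf = sklearn.tree.DecisionTreeClassifier()
clf.fit(X, y)
# For each row of X, get the index of the leaf node it lands in.
# The low-level tree_ object expects input cast to sklearn.tree._tree.DTYPE.
leaf_ids = clf.tree_.apply(np.asfortranarray(X.astype(sklearn.tree._tree.DTYPE)))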
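
Newer scikit-learn releases also expose a public apply method on the fitted estimator. It returns the same kind of leaf indices without reaching into the private tree_ internals. Here is a minimal sketch, assuming a reasonably recent scikit-learn. The iris data stands in for X and y, and the clusters dictionary is only an illustration of how the leaf indices might be used; neither is part of the original snippet.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): any (n_samples, n_features) array works here.
X, y = load_iris(return_X_y=True)

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, y)

# Public API: one leaf node index per input row.
leaf_ids = clf.apply(X)

# Treat each leaf as a "cluster": map leaf index -> row indices of X in that leaf.
clusters = {
    int(leaf): np.flatnonzero(leaf_ids == leaf)
    for leaf in np.unique(leaf_ids)
}

for leaf, members in clusters.items():
    print(f"leaf {leaf}: {len(members)} samples")
```

The leaf indices are node ids inside the fitted tree, so they are not consecutive; remap them (for example with np.unique(leaf_ids, return_inverse=True)) if you need labels 0..k-1.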
Guy Adini answered Oct 19 '22 06:10
