
Visualizing decision tree in scikit-learn

I am trying to build a simple decision tree using scikit-learn in Python (Anaconda's IPython Notebook with Python 2.7.3 on Windows) and visualize it as follows:

    from pandas import read_csv, DataFrame
    from sklearn import tree
    from os import system

    data = read_csv('D:/training.csv')
    Y = data.Y
    X = data.ix[:,"X0":"X33"]

    dtree = tree.DecisionTreeClassifier(criterion = "entropy")
    dtree = dtree.fit(X, Y)

    dotfile = open("D:/dtree2.dot", 'w')
    dotfile = tree.export_graphviz(dtree, out_file = dotfile, feature_names = X.columns)
    dotfile.close()
    system("dot -Tpng D:.dot -o D:/dtree2.png")

However, I get the following error:

AttributeError: 'NoneType' object has no attribute 'close' 

I use the following blog post as reference: Blogpost link

The following Stack Overflow question doesn't seem to work for me either: Question

Could someone help me with how to visualize the decision tree in scikit-learn?

Ravi asked Jan 07 '15 11:01


2 Answers

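Here is a one-liner for those using Jupyter and sklearn (0.18.2+). You don't even need matplotlib for it. The only requirement is graphviz (the Python package below; it calls the Graphviz dot executable, which must be installed separately and be on your PATH):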

pip install graphviz 

Then run the following (as in the question's code, X is a pandas DataFrame):

    from graphviz import Source
    from sklearn import tree
    # dtree is the fitted classifier from the question
    Source(tree.export_graphviz(dtree, out_file=None, feature_names=X.columns))

This displays the tree in SVG format. The code above produces a graphviz Source object (a thin wrapper around the DOT source code, nothing scary) that Jupyter renders directly.
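To see what Source actually receives, here is a minimal sketch that prints the start of the DOT text returned by export_graphviz when out_file=None. The toy data and the feature names x0 and x1 are assumptions made up for illustration; they are not the question's training.csv.

```python
from sklearn import tree

# Toy data, assumed purely for illustration (not the question's training.csv).
X_toy = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_toy = [0, 0, 1, 1]

clf = tree.DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(X_toy, y_toy)

# With out_file=None the DOT source is returned as a string
# instead of being written to a file.
dot_data = tree.export_graphviz(clf, out_file=None, feature_names=["x0", "x1"])

# Show the type and the first line of the DOT source.
print(type(dot_data).__name__)
print(dot_data.splitlines()[0])
```

That string is what gets passed to Source in the one-liner, so you can also write it to a .dot file yourself if you prefer the command-line dot tool.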

Some things you are likely to do with it:

Display it in Jupyter:

    from IPython.display import SVG
    graph = Source(tree.export_graphviz(dtree, out_file=None, feature_names=X.columns))
    SVG(graph.pipe(format='svg'))

Save as png:

    graph = Source(tree.export_graphviz(dtree, out_file=None, feature_names=X.columns))
    graph.format = 'png'
    # writes dtree_render.png and opens it in the default viewer
    graph.render('dtree_render', view=True)

Get the png image, save it and view it:

    graph = Source(tree.export_graphviz(dtree, out_file=None, feature_names=X.columns))
    png_bytes = graph.pipe(format='png')
    with open('dtree_pipe.png', 'wb') as f:
        f.write(png_bytes)

    from IPython.display import Image
    Image(png_bytes)

If you are going to play with that library, here are the links to its examples and user guide.

singer answered Sep 23 '22 14:09


When you pass it an out_file, sklearn.tree.export_graphviz writes to that file and doesn't return anything, so the call evaluates to None.

By doing dotfile = tree.export_graphviz(...) you overwrite your open file object, which had been previously assigned to dotfile, so you get an error when you try to close the file (as it's now None).

To fix it, change your code to:

    ...
    dotfile = open("D:/dtree2.dot", 'w')
    tree.export_graphviz(dtree, out_file = dotfile, feature_names = X.columns)
    dotfile.close()
    ...
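The same mistake can be reproduced without scikit-learn, which may make the mechanism easier to see. This is only a sketch: write_dot is a hypothetical stand-in (not part of scikit-learn) for any function that writes to a file object and returns None, and the temporary path is an assumption for illustration. It also shows a with block as one way to avoid the explicit close().

```python
import os
import tempfile


def write_dot(out_file):
    """Hypothetical stand-in: writes to out_file and implicitly returns None."""
    out_file.write("digraph Tree {}\n")


# Temporary location, assumed for illustration only.
path = os.path.join(tempfile.mkdtemp(), "dtree2.dot")

# Buggy pattern from the question: the name bound to the file is rebound to None.
dotfile = open(path, "w")
handle = dotfile              # second reference so the file can still be closed
dotfile = write_dot(dotfile)  # dotfile is now None
try:
    dotfile.close()
except AttributeError as exc:
    error_message = str(exc)
finally:
    handle.close()

# Safer pattern: the with block closes the file, and the return value is ignored.
with open(path, "w") as dotfile:
    write_dot(dotfile)

print(error_message)
print(dotfile.closed)
```

With a with block there is no close() call to get wrong, and the file is closed even if the export raises.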
Ffisegydd answered Sep 22 '22 14:09