It is possible to visualize decision trees using pydotplus from PyPI, but it has issues on my machine: it says it was not built with libexpat, so it only shows a number on each node instead of a table with some information. I'd like to use an alternative. I already tried networkx, but it requires pygraphviz to read .dot files and build a networkx graph from them. When I tried to install pygraphviz using pip, that also failed.
So now I am looking for an alternative way of visualizing decision trees, which can be installed using pip or anaconda.
Which alternatives exist?
digraph Tree {
node [shape=box, style="filled", color="black"] ;
0 [label="grade.B <= 0.5\ngini = 0.5\nsamples = 37224\nvalue = [18476, 18748]", fillcolor="#399de504"] ;
1 [label="grade.C <= 0.5\ngini = 0.4973\nsamples = 32094\nvalue = [17218, 14876]", fillcolor="#e5813923"] ;
0 -> 1 [labeldistance=2.5, labelangle=45, headlabel="True"] ;
2 [label="gini = 0.4829\nsamples = 21728\nvalue = [12875, 8853]", fillcolor="#e5813950"] ;
1 -> 2 ;
3 [label="gini = 0.4869\nsamples = 10366\nvalue = [4343, 6023]", fillcolor="#399de547"] ;
1 -> 3 ;
4 [label="grade.A <= 14.8301\ngini = 0.3702\nsamples = 5130\nvalue = [1258, 3872]", fillcolor="#399de5ac"] ;
0 -> 4 [labeldistance=2.5, labelangle=-45, headlabel="False"] ;
5 [label="gini = 0.3555\nsamples = 4987\nvalue = [1153, 3834]", fillcolor="#399de5b2"] ;
4 -> 5 ;
6 [label="gini = 0.3902\nsamples = 143\nvalue = [105, 38]", fillcolor="#e58139a3"] ;
4 -> 6 ;
}
I programmed this in a Jupyter notebook, but it has a bug: the SVG is not colored if you try to display it using:

I found a work-around here:
from IPython.display import HTML
svg = None
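with open('dtree.svg') as svg_file:
    svg = svg_file.read()  # read the whole SVG as text
HTML(svg)  # render the raw SVG markup in the notebook cell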
It's not the sexiest solution, but I use the Graphviz CLI (it's called dot), called via subprocess. I'm on a Mac, so I installed it with Homebrew, but you can download binaries for other platforms from their downloads page. Here's an example using the Titanic dataset:
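import subprocess
import pandas as pd
import seaborn as sns  # seaborn.apionly is gone from recent seaborn releases
from sklearn.impute import SimpleImputer  # replaces Imputer, removed in scikit-learn 0.22
from sklearn.tree import DecisionTreeClassifier, export_graphviz
raw_data = sns.load_dataset('titanic')
target = 'survived'  # assumed: the definition of target is missing from the original
predictors = ['pclass','sex','age','sibsp','parch','fare','embarked','alone','adult_male']
categorical = ['sex','embarked']
numeric = [c for c in predictors if c not in categorical]
# One-hot encode the categorical columns, then fill missing values with the column mean
encoded_data = pd.get_dummies(raw_data[predictors], columns=categorical)
imputer = SimpleImputer()
X = imputer.fit_transform(encoded_data).astype('float32')
Y = raw_data[target].astype('float32')
model = DecisionTreeClassifier(min_samples_leaf=10, max_depth=3)
model.fit(X, Y)
dot_file = 'tree.dot'  # assumed name: the file name is blank in the original
export_graphviz(model, out_file=dot_file, impurity=False)
# Render the .dot file to PDF with the Graphviz CLI
subprocess.check_call(['dot', '-Tpdf', dot_file, '-o', 'tree.pdf'])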
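If you call dot from several places, it may help to wrap the subprocess call in a small helper that first checks whether Graphviz is on the PATH. This is only a sketch using the standard library: build_dot_command and render_dot are names I made up for illustration, and the file names in the usage line are placeholders.

```python
import shutil
import subprocess
from pathlib import Path


def build_dot_command(dot_path, out_path, fmt="pdf"):
    """Return the argument list for rendering a .dot file with the Graphviz CLI."""
    # -T selects the output format, -o names the output file.
    return ["dot", f"-T{fmt}", str(dot_path), "-o", str(out_path)]


def render_dot(dot_path, out_path, fmt="pdf"):
    """Render dot_path to out_path; return False if `dot` is not installed."""
    if shutil.which("dot") is None:
        # Graphviz is not on PATH, so there is nothing to call.
        return False
    # check=True raises CalledProcessError if dot exits with an error.
    subprocess.run(build_dot_command(dot_path, out_path, fmt), check=True)
    return Path(out_path).exists()
```

Usage would look like render_dot('tree.dot', 'tree.svg', fmt='svg'). Passing the arguments as a list (rather than one shell string) avoids quoting problems with file names that contain spaces.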
Since version 0.21, scikit-learn has a plot_tree function, which plots the tree with matplotlib.
The code to use plot_tree:
from sklearn import tree
# clf is a fitted decision tree object
tree.plot_tree(clf)
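For an end-to-end run of plot_tree, here is a minimal sketch. The iris dataset, the max_depth=2 setting, and the output file name iris_tree.png are my own choices for illustration, not part of the answer above. It only needs scikit-learn (0.21 or later) and matplotlib, both installable with pip or conda, and no Graphviz.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs without a display
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, plot_tree

# Fit a small tree on a bundled dataset (illustrative choice).
iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(iris.data, iris.target)

# Draw the tree onto a matplotlib axis; filled=True colors nodes by majority class.
fig, ax = plt.subplots(figsize=(8, 5))
annotations = plot_tree(
    clf,
    feature_names=iris.feature_names,
    class_names=list(iris.target_names),
    filled=True,
    ax=ax,
)

# Save to a file instead of showing a window.
fig.savefig("iris_tree.png", dpi=150)
plt.close(fig)
```

In a Jupyter notebook you can drop the matplotlib.use("Agg") line and the savefig call, and the figure should show up inline.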
An alternative to the sklearn plots is the dtreeviz package. An example of the tree is below. The code to use dtreeviz:
from dtreeviz.trees import dtreeviz # remember to load the package
# clf is a fitted decision tree object
viz = dtreeviz(clf, X, y,
You can find a comparison of different scikit-learn tree plotting techniques here.
