Feedback: Visualization for Apache Spark Decision Trees [duplicate]

I am using Apache Spark MLlib 1.4.1 (PySpark, the Python API for Spark) to train a decision tree on LabeledPoint data. The tree trains correctly, and I can print it to the terminal (or "extract the rules", as the user in "How to extract rules from decision tree spark MLlib" calls it) using:

from pyspark.mllib.tree import DecisionTree

model = DecisionTree.trainClassifier( ... )
print(model.toDebugString())

But what I want is to visualize or plot the decision tree rather than print it to the terminal. Is there a way to plot the decision tree in PySpark, or can I save the decision tree data and plot it in R? Thanks!

PyRsquared asked Nov 21 '22 07:11


1 Answer

The Decision-Tree-Visualization-Spark project visualizes Spark decision tree models.

It works in two steps:

  • Parse Spark Decision Tree output to a JSON format.
  • Use the JSON file as an input to a D3.js visualization.

For the parser, see Dt.py.

The parser function, tree_json(tree), takes your model's toDebugString() output as input.
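
To illustrate the first step, here is a minimal sketch of turning a debug string into nested JSON. It is not the code from Dt.py: the function name debug_string_to_dict is hypothetical, the sample string only assumes the usual "If (...) / Else (...) / Predict: ..." layout of toDebugString(), and the "name"/"children" keys are an assumed layout of the kind D3 hierarchy examples commonly read.

```python
import json


def debug_string_to_dict(debug_string):
    """Convert an If/Else/Predict debug string into a nested dict.

    Hypothetical helper, not part of Spark or of Dt.py. It relies only on
    the order of the lines (pre-order: If, left subtree, Else, right subtree).
    """
    # Keep only the tree lines; this drops the header and blank lines.
    lines = [ln.strip() for ln in debug_string.splitlines()]
    tree_lines = iter(
        ln for ln in lines if ln.startswith(("If", "Else", "Predict"))
    )

    def parse_node():
        line = next(tree_lines)
        if line.startswith("Predict"):
            # Leaf node: keep the prediction text as the label.
            return {"name": line}
        # Internal node: "If (condition)" followed by the left subtree.
        condition = line[line.index("(") + 1:line.rindex(")")]
        left = parse_node()
        next(tree_lines)  # consume the matching "Else (...)" line
        right = parse_node()
        return {"name": condition, "children": [left, right]}

    return parse_node()


# Assumed sample of what model.toDebugString() might return.
sample = """DecisionTreeModel classifier of depth 2 with 5 nodes
  If (feature 0 <= 0.5)
   If (feature 1 <= 1.0)
    Predict: 0.0
   Else (feature 1 > 1.0)
    Predict: 1.0
  Else (feature 0 > 0.5)
   Predict: 1.0
"""

tree = debug_string_to_dict(sample)
print(json.dumps(tree, indent=2))
```

With a real model you would pass model.toDebugString() instead of sample and write the json.dumps output to a file for the D3.js page to load.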

Answer from question

Vishnu667 answered May 12 '23 18:05