I have a specific technical question about scikit-learn's random forest classifier.
After fitting the data with the ".fit(X,y)" method, is there a way to extract the actual trees from the estimator object, in some common format, so that the ".predict(X)" method can be implemented outside Python?
The random forest is a classification algorithm consisting of many decision trees. It uses bagging and feature randomness when building each individual tree to try to create an uncorrelated forest of trees whose prediction by committee is more accurate than that of any individual tree.
To extract rules from a decision tree, one rule is created for each path from the root to a leaf node. Each splitting criterion along a given path is logically ANDed to form the rule antecedent (“IF” part). The leaf node holds the class prediction, forming the rule consequent (“THEN” part).
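As a rough illustration of the path-to-rule idea, here is a minimal sketch in plain Python. It assumes the tree is given as parallel arrays where -1 marks a leaf, which mirrors the layout scikit-learn uses internally. The function name extract_rules, the leaf_class list and the tiny example tree are all made up for this illustration.

```python
def extract_rules(children_left, children_right, feature, threshold, leaf_class):
    """Return one (conditions, predicted_class) pair per root-to-leaf path."""
    rules = []

    def walk(node, conditions):
        # A node whose left child is -1 is treated as a leaf.
        if children_left[node] == -1:
            rules.append((list(conditions), leaf_class[node]))
            return
        f, t = feature[node], threshold[node]
        # Left branch: test is true. Right branch: test is false.
        walk(children_left[node], conditions + [f"x[{f}] <= {t}"])
        walk(children_right[node], conditions + [f"x[{f}] > {t}"])

    walk(0, [])
    return rules


# Hypothetical tree with three leaves:
#   node 0: x[0] <= 2.5 ? node 1 : node 2
#   node 2: x[1] <= 1.0 ? node 3 : node 4
children_left = [1, -1, 3, -1, -1]
children_right = [2, -1, 4, -1, -1]
feature = [0, -2, 1, -2, -2]
threshold = [2.5, -2.0, 1.0, -2.0, -2.0]
leaf_class = [None, "A", None, "B", "C"]

rules = extract_rules(children_left, children_right, feature, threshold, leaf_class)
for conditions, label in rules:
    # The conditions along one path are ANDed to form the IF part.
    print("IF " + " AND ".join(conditions) + f" THEN class = {label}")
```

Each printed line is one rule: the conditions collected on the way down form the antecedent, and the class stored at the leaf is the consequent.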
Yes, the trees of a forest are stored in the estimators_ attribute of the forest object.
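To make this concrete, here is a rough sketch (not an official exporter) that dumps each fitted tree to JSON and then re-implements prediction using only the exported data. It assumes a single-output classifier and reads the arrays exposed on each tree's tree_ attribute (children_left, children_right, feature, threshold, value). The helper names tree_to_dict and predict_one and the JSON layout are invented for this example, and the iris data is only a stand-in for your own X and y.

```python
import json

import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)


def tree_to_dict(estimator):
    """Copy the arrays describing one fitted tree into plain Python lists."""
    t = estimator.tree_
    return {
        "children_left": t.children_left.tolist(),    # -1 marks a leaf
        "children_right": t.children_right.tolist(),
        "feature": t.feature.tolist(),
        "threshold": t.threshold.tolist(),
        # value has shape (node_count, n_outputs, n_classes); keep output 0.
        "value": t.value[:, 0, :].tolist(),
    }


exported = json.dumps({
    "classes": forest.classes_.tolist(),
    "trees": [tree_to_dict(est) for est in forest.estimators_],
})


def predict_one(model, x):
    """Predict one sample from the exported data, without the forest object."""
    # scikit-learn converts inputs to float32 before comparing to thresholds,
    # so round the sample the same way, then compare in double precision.
    x = np.asarray(x, dtype=np.float32).astype(np.float64)
    total = np.zeros(len(model["classes"]))
    for tree in model["trees"]:
        node = 0
        while tree["children_left"][node] != -1:
            if x[tree["feature"][node]] <= tree["threshold"][node]:
                node = tree["children_left"][node]
            else:
                node = tree["children_right"][node]
        leaf = np.asarray(tree["value"][node])
        total += leaf / leaf.sum()  # per-tree class probabilities
    # The forest predicts the class with the highest summed probability.
    return model["classes"][int(np.argmax(total))]


model = json.loads(exported)
manual = [predict_one(model, row) for row in X]
print(manual[:5], forest.predict(X[:5]).tolist())
```

The loop in predict_one only needs array indexing and comparisons, so it should be straightforward to port to another language once the JSON file is available there.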
You can have a look at the implementation of the export_graphviz function to learn how to write your custom exporter:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/export.py
Here is the usage doc for this function:
http://scikit-learn.org/stable/modules/tree.html#classification
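For reference, a small sketch of calling export_graphviz on every tree of a fitted forest. The iris data and the forest size are arbitrary choices for the example. Passing out_file=None makes the function return the DOT source as a string instead of writing a file.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_graphviz

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=3, random_state=0).fit(X, y)

# One DOT description per tree in the forest.
dot_sources = [export_graphviz(est, out_file=None) for est in forest.estimators_]

# Show the beginning of the first tree's DOT text.
print(dot_sources[0][:200])
```

The DOT text is meant for drawing the tree with Graphviz, so it is probably more useful as a model for your own exporter than as an interchange format.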