Can we choose which decision tree algorithm scikit-learn uses?
The scikit-learn user guide says that an optimised version of the CART algorithm is used.
Can we switch to another algorithm, such as C4.5?
Which one is implemented in scikit-learn? ID3 (Iterative Dichotomiser 3) was developed in 1986 by Ross Quinlan. The algorithm creates a multiway tree, finding for each node (i.e. in a greedy manner) the categorical feature that will yield the largest information gain for categorical targets.
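To make "largest information gain" concrete, here is a minimal pure-Python sketch of how an ID3-style split could be scored. The toy data and the helper names (entropy, information_gain) are made up for illustration; this is not scikit-learn code.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after a multiway split."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: two categorical features and a categorical target.
features = {
    "outlook": ["sunny", "sunny", "rain", "rain"],
    "windy": ["yes", "no", "yes", "no"],
}
play = ["no", "no", "yes", "yes"]

# Greedy step: pick the feature with the largest information gain.
gains = {name: information_gain(values, play) for name, values in features.items()}
best_feature = max(gains, key=gains.get)
print(gains, best_feature)
```

In this toy case "outlook" separates the target perfectly while "windy" tells us nothing, so a greedy ID3-style step would split on "outlook" first.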
A decision tree is a non-parametric supervised learning algorithm used for both classification and regression tasks. It has a hierarchical tree structure consisting of a root node, branches, internal nodes and leaf nodes.
Decision trees can handle non-linear data sets effectively.
No. See the documentation:
scikit-learn uses an optimised version of the CART algorithm.
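But there is a criterion parameter that lets us choose between "gini" and "entropy". This changes only how split quality is measured, not the tree-building algorithm itself:
from sklearn import tree
clf = tree.DecisionTreeClassifier(criterion="entropy")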
criterion : string, optional (default="gini") The function to measure the quality of a split. Supported criteria are "gini" for the Gini impurity and "entropy" for the information gain.
see Docs
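As a rough illustration of the answers above, the sketch below fits the same classifier with both criteria on the iris dataset bundled with scikit-learn. The dataset and random_state=0 are my own choices for the example, not something the question specified.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Small built-in dataset, used here only for demonstration.
X, y = load_iris(return_X_y=True)

fitted = {}
for criterion in ("gini", "entropy"):
    # Same estimator class both times; only the split-quality measure differs.
    clf = DecisionTreeClassifier(criterion=criterion, random_state=0)
    clf.fit(X, y)
    fitted[criterion] = clf
    print(criterion, "depth:", clf.get_depth(), "leaves:", clf.get_n_leaves())
```

Either way the result is a binary tree built by the same estimator, so setting criterion="entropy" does not turn it into ID3 or C4.5.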