Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is pruned and unpruned tree in Weka?

Tags:

java

weka

In decision tree J48 example, when we say tree pruned or unpruned, what is the difference?

like image 502
London guy Avatar asked Jul 20 '12 19:07

London guy


2 Answers

The unpruned trees are larger. What happens is that basically the tree is created according to the implemented algorithm and if pruning is enabled, an additional step looks at what nodes/branches can be removed without affecting the performance too much.

The idea behind pruning is that, apart from making the tree easier to understand, you reduce the risk of overfitting to the training data. That is, being able to classify the training data (almost) perfectly, but nothing else because instead of learning the underlying concept, the tree has learned the properties intrinsic and specific to the training data.

like image 111
Lars Kotthoff Avatar answered Oct 19 '22 03:10

Lars Kotthoff


I would like to add following to Lars' answer. Taken from following link

Many algorithms attempt to "prune", or simplify, their results. Pruning produces fewer, more easily interpreted results. More importantly, pruning can be used as a tool to correct for potential overfitting. ...

J48 employs two pruning methods.

The first is known as subtree replacement. This means that nodes in a decision tree may be replaced with a leaf -- basically reducing the number of tests along a certain path. This process starts from the leaves of the fully formed tree, and works backwards toward the root.

The second type of pruning used in J48 is termed subtree raising. In this case, a node may be moved upwards towards the root of the tree, replacing other nodes along the way. Subtree raising often has a negligible effect on decision tree models. There is often no clear way to predict the utility of the option, though it may be advisable to try turning it off if the induction process is taking a long time. This is due to the fact that subtree raising can be somewhat computationally complex.

like image 26
Atilla Ozgur Avatar answered Oct 19 '22 04:10

Atilla Ozgur