I want to compare the error rates of several classifiers against the error rate of a weak learner (one that is only slightly better than random guessing). So, my question is: what are a few choices for a simple, easy-to-process weak learner? Or do I misunderstand the concept, and is a weak learner simply any benchmark that I choose (for example, a linear regression)?
Weak learners: A 'weak learner' is any ML algorithm (for regression or classification) whose accuracy is slightly better than random guessing. For example, in a binary classification problem with approximately 50% of samples belonging to each class, random guessing is right about half the time, so a weak learner only needs to score slightly above 50%.
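To make the "slightly better than random guessing" baseline concrete, here is a minimal sketch on made-up data. Everything in it is an assumption for illustration: the synthetic one-feature dataset, the class means, and the hand-picked cut at 0.25 are not from any real problem.

```python
import random

def accuracy(preds, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10_000

# Synthetic, exactly balanced binary problem (assumption for illustration):
# one noisy feature whose mean is 0.0 for class 0 and 0.5 for class 1.
labels = [i % 2 for i in range(n)]
features = [rng.gauss(0.5 * y, 1.0) for y in labels]

# Baseline: guess a class uniformly at random.
random_guess = [rng.randint(0, 1) for _ in range(n)]

# A deliberately crude rule: a single cut halfway between the class means.
weak_rule = [1 if x > 0.25 else 0 for x in features]

random_acc = accuracy(random_guess, labels)
weak_acc = accuracy(weak_rule, labels)
print(f"random guessing: {random_acc:.3f}")
print(f"weak rule:       {weak_acc:.3f}")
```

The random baseline should land near 50% and the crude rule somewhat above it, which is all the definition asks of a weak learner.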
Sometimes an ML model is only a weak learner. Boosting is a way to take several weak models and combine them into a stronger one. Doing this can reduce bias and improve the accuracy of the combined model.
→ The weak learners in AdaBoost are typically decision trees with a single split, called decision stumps.
→ AdaBoost works by putting more weight on difficult-to-classify instances and less on those already handled well.
→ AdaBoost can be used for both classification and regression problems.
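The reweighting idea in the points above can be sketched in plain Python. This is a minimal, from-scratch illustration of discrete AdaBoost with decision stumps on a single feature, not a reference implementation; the helper names (`fit_stump`, `stump_predict`, `adaboost`, `predict`) and the ten-point toy dataset are made up for the example.

```python
import math

def fit_stump(xs, ys, weights):
    """Best single-split rule under the given sample weights.

    Returns (threshold, polarity, weighted_error); the rule predicts
    `polarity` when x > threshold and `-polarity` otherwise.
    """
    ordered = sorted(set(xs))
    # Candidate cuts: one below all points, then midpoints between neighbours.
    candidates = [ordered[0] - 1.0] + [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    best = None
    for t in candidates:
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if (polarity if x > t else -polarity) != y)
            if best is None or err < best[2]:
                best = (t, polarity, err)
    return best

def stump_predict(stump, x):
    t, polarity, _ = stump
    return polarity if x > t else -polarity

def adaboost(xs, ys, rounds):
    """Train up to `rounds` weighted stumps; labels must be +1 / -1."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(xs, ys, weights)
        err = max(stump[2], 1e-10)  # guard against division by zero
        if err >= 0.5:              # no better than chance: stop
            break
        alpha = 0.5 * math.log((1 - err) / err)  # vote strength of this stump
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified points, lower the rest, renormalise.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score > 0 else -1

# Toy data (made up): no single threshold separates the two classes.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

single_error = fit_stump(xs, ys, [1.0 / len(xs)] * len(xs))[2]
ensemble = adaboost(xs, ys, rounds=3)
train_acc = sum(predict(ensemble, x) == y for x, y in zip(xs, ys)) / len(xs)
print(f"best single stump error: {single_error:.2f}")
print(f"ensemble training accuracy: {train_acc:.2f}")
```

On this toy data the best single stump still misclassifies three of the ten points, while three boosted stumps together classify every training point. That only says something about training error on a tiny example, not about generalization.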
Solution: (A) Weak learners are sure about only a particular part of a problem. They therefore usually don't overfit, which means that weak learners have low variance and high bias.
better than random guessing
That is basically the only requirement for a weak learner. So long as you can consistently beat random guessing, any true boosting algorithm will be able to increase the accuracy of the final ensemble. Which weak learner you should choose is then a trade-off between 3 factors:
The classic weak learner is a decision tree. By changing the maximum depth of the tree, you can control all 3 factors. This makes decision trees incredibly popular for boosting. What you should use depends on your individual problem, but decision trees are a good starting point.
NOTE: So long as the algorithm supports weighted data instances, any algorithm can be used for boosting. A guest speaker at my university was boosting 5-layer-deep neural networks for his work in computational biology.
A common weak learner is basically a threshold on a single feature. One simple example is a 1-level decision tree, called a decision stump, used in bagging or boosting. It just chooses a threshold for one feature and splits the data on that threshold (for example, to determine whether an iris flower is Iris versicolor or Iris virginica based on the petal width). Bagging or AdaBoost then trains such stumps and combines them.
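As a rough sketch of what such a stump does, the snippet below searches for the single cut on one feature that misclassifies the fewest samples. The petal-width values are hand-made for illustration and are not taken from the real Iris dataset; `fit_threshold` is a hypothetical helper name.

```python
def fit_threshold(values, labels, low_label, high_label):
    """Pick the cut on one feature with the fewest misclassified samples.

    Samples with value > cut are predicted `high_label`, the rest `low_label`.
    Returns (cut, number_of_errors).
    """
    ordered = sorted(set(values))
    # Candidate cuts are midpoints between neighbouring distinct values.
    candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    best_cut, best_errors = None, len(values) + 1
    for cut in candidates:
        errors = sum((high_label if v > cut else low_label) != y
                     for v, y in zip(values, labels))
        if errors < best_errors:
            best_cut, best_errors = cut, errors
    return best_cut, best_errors

# Hand-made petal widths in cm (illustrative only, not real measurements).
petal_width = [1.0, 1.3, 1.4, 1.5, 1.8, 1.6, 1.8, 2.0, 2.3, 2.5]
species = ["versicolor"] * 5 + ["virginica"] * 5

cut, errors = fit_threshold(petal_width, species, "versicolor", "virginica")
print(f"split at petal width > {cut:.2f} cm, {errors} of {len(species)} misclassified")
```

Because the two classes overlap in this made-up sample, no single cut gets everything right, which is exactly the "better than chance but far from perfect" behaviour expected of a stump.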