
Gradient boosting on Vowpal Wabbit

Is there a way to use gradient boosting on regression using Vowpal Wabbit? I use various techniques that come with Vowpal Wabbit that are helpful. I want to try gradient boosting along with that, but I can't find a way to implement gradient boosting on VW.

asked May 03 '15 by breadnbutter


People also ask

What is gradient boosting on decision trees?

Like bagging and boosting, gradient boosting is a methodology applied on top of another machine learning algorithm. Informally, gradient boosting involves two types of models: a "weak" machine learning model, which is typically a decision tree, and a "strong" model, which is built up as an ensemble of the weak ones.
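
As a rough sketch (the standard textbook formulation, not specific to any one library), for a differentiable loss L the strong model is built additively:

    F_0(x) = argmin_c  sum_i L(y_i, c)                      # initial constant model
    r_i    = -dL(y_i, F(x_i)) / dF(x_i)  at F = F_{m-1}     # pseudo-residuals
    h_m    = weak learner (e.g. a small tree) fit to the pairs (x_i, r_i)
    F_m(x) = F_{m-1}(x) + nu * h_m(x)                       # nu is the learning rate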

How does Vowpal wabbit work?

Vowpal Wabbit handles learning problems with any number of sparse features. It was the first published tera-scale learner, achieving excellent scaling. It features distributed, out-of-core learning and pioneered the hashing trick; together these make its memory footprint bounded, independent of the training data size.
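
For context, here is a minimal VW invocation (file names are hypothetical): -b sets the width of the feature hash in bits, and --cache_file lets VW stream several passes over data that does not fit in memory:

    # train.vw holds one example per line, e.g. "1.5 | height:1.72 weight:60.3"
    vw -d train.vw -b 24 --passes 5 --cache_file train.cache -f model.vw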

Is gradient boosting good for regression?

Gradient boosting produces a predictive model from an ensemble of weak predictive models, and it can be used for both regression and classification problems.
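
With VW itself, a plain (unboosted) regression run looks like this; file names are hypothetical, and squared loss is VW's default for regression:

    # train a squared-loss regressor, then score a held-out set
    vw -d train.vw --loss_function squared -f model.vw
    vw -t -i model.vw -d test.vw -p predictions.txt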

Is gradient boosting good for classification?

Gradient Boosting has repeatedly proven to be one of the most powerful techniques for building predictive models in both classification and regression.


1 Answer

The idea of gradient boosting is that an ensemble model is built from black-box weak models. You can surely use VW as the black box, but note that VW does not offer decision trees, which are the most popular choice for the black-box weak models in boosting. Boosting in general decreases bias (and increases variance), so you should make sure that the VW models have low variance (no overfitting). See bias-variance tradeoff.
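
One hand-rolled way to get gradient boosting on top of VW is to treat each VW model as the weak learner and, under squared loss, fit every new model to the residuals of the ensemble so far (the negative gradient of squared loss is exactly the residual). A rough sketch, with hypothetical file names, assuming the label is the first token of each line in train.vw and examples carry no tags or importance weights:

    # round 1: fit the raw labels, then score the training set with the frozen model
    vw -d train.vw --loss_function squared -f m1.vw
    vw -t -i m1.vw -d train.vw -p p1.txt
    # residual = label - prediction; rebuild a file of "residual | features"
    paste -d' ' p1.txt train.vw \
      | awk '{r = $2 - $1; print r" "substr($0, index($0, "|"))}' > resid.vw
    # round 2: fit the residuals; predict with m1 plus (shrunken) m2, and so on
    vw -d resid.vw --loss_function squared -f m2.vw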

There are some reductions related to boosting and bagging in VW:

  • --autolink N adds a polynomial link function of degree N, which can be considered a simple form of boosting.
  • --log_multi K is an online boosting algorithm for K-class classification. See the paper. You can use it even for binary classification (K=2), but not for regression.
  • --bootstrap M performs M-way bootstrap by online importance resampling. Use --bs_type=vote for classification and --bs_type=mean for regression. Note that this is bagging, not boosting.
  • --boosting N (added on 2015-06-17) performs online boosting with N weak learners; see the theoretical paper. Example invocations for all four options are sketched below.
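
For illustration, example invocations of the options above (data files are hypothetical; the flags are the ones named in this list):

    # degree-2 polynomial link function
    vw -d train.vw --autolink 2 -f model.vw

    # logarithmic-time online multiclass with K=3 (labels 1..3)
    vw -d multi.vw --log_multi 3 -f model.vw

    # 10-way bootstrap; average the ensemble's predictions for regression
    vw -d train.vw --bootstrap 10 --bs_type mean -f model.vw

    # online boosting with 20 weak learners
    vw -d train.vw --boosting 20 -f model.vw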
answered Sep 26 '22 by Martin Popel