
Caret xgbLinear and xgbTree

I am using these two variants of GBM in caret. I tried both algorithms on the same dataset and they return different accuracies and take different amounts of time to train. From the names, I assume the first uses a linear function somewhere and the other uses trees, but I do not understand where the linear part comes in. I know GBM algorithms use trees as the base learners; could it be that the first case uses a different structure or training procedure? Where can I find documentation on this topic?

Thanks

youngz asked Sep 26 '16


1 Answer

You can find more details on the individual models on the caret GitHub page, where all the code for the models is located. The caret documentation is located here.

But you should be aware of the differences in the tuning parameters used by the two models:

  • xgbLinear uses: nrounds, lambda, alpha, eta
  • xgbTree uses: nrounds, max_depth, eta, gamma, colsample_bytree, min_child_weight

These choices affect the outcome of the model and will result in different predictions, and therefore different accuracies. Any other options available in xgboost will be used with xgboost's default settings.
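For illustration, the two tuning grids might look like this. The parameter values below are arbitrary placeholders, not recommendations, and depending on your caret version xgbTree may also expose `subsample` as a tuning parameter:

```r
library(caret)

# Tuning grid for xgbLinear: the linear-booster parameters
linear_grid <- expand.grid(
  nrounds = c(50, 100),   # number of boosting iterations
  lambda  = c(0, 0.1),    # L2 regularisation
  alpha   = c(0, 0.1),    # L1 regularisation
  eta     = 0.3           # learning rate
)

# Tuning grid for xgbTree: the tree-booster parameters
tree_grid <- expand.grid(
  nrounds          = c(50, 100),  # number of boosting iterations
  max_depth        = c(3, 6),     # maximum tree depth
  eta              = 0.3,         # learning rate
  gamma            = 0,           # minimum loss reduction to split
  colsample_bytree = 0.8,         # column subsampling per tree
  min_child_weight = 1            # minimum sum of instance weight in a child
)
```

Each grid is then passed via `tuneGrid` to `train()` with the matching `method`; caret evaluates every row of the grid by resampling and keeps the best combination.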

Caret models accept the dots (`...`) argument, so if you want to set `gamma` when training with xgbLinear, you can pass it directly to the `train` function, but not via the tuning grid. The same goes for any other xgboost parameter.

A (deliberately simplistic) example:

library(caret)

grid <- expand.grid(nrounds = c(10, 20), lambda = 0.1, alpha = 1, eta = 0.1)
train(Species ~ ., data = iris, method = "xgbLinear", tuneGrid = grid, gamma = 0.5)
phiver answered Nov 12 '22