
What is the difference between h2o.ensemble and h2o.stack in the h2oEnsemble package?

Tags:

r

h2o

According to the function descriptions:

h2o.stack: This function creates a "Super Learner" (stacking) ensemble using a list of existing H2O base models specified by the user.

h2o.ensemble: This function creates a "Super Learner" (stacking) ensemble using the H2O base learning algorithms specified by the user.

Tao Hu asked Feb 23 '17 06:02



1 Answer

They are two different ways to construct an ensemble. Their interfaces differ, but they produce exactly the same type of object in the end.

  • The h2o.stack() function takes as input a list of already trained (and cross-validated) H2O models, so all it needs to do is the metalearning (combiner) step, which is very fast. This is useful if you want to use a grid of H2O models or a collection of grids of H2O models as the base learners. The only caveat is that all the base learners must have used identical cross-validation folds. If you use fold_assignment = "Modulo" in all the base learners (or grid) that will ensure identical folds.
  • The h2o.ensemble() function lets the user specify which base models they want in the ensemble, then does all of the training and cross-validation of the base models, followed by the metalearning (combiner) step. This takes much longer, since it has to train all the base models itself.
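As a rough illustration of the two routes above, here is a minimal sketch. The file path, the response column name, and the particular base learners are placeholders chosen for illustration, not something from the question. It assumes a running H2O cluster, an installed h2oEnsemble package, and a binary response. Note that models passed to h2o.stack() also need keep_cross_validation_predictions = TRUE so the metalearner has cross-validated predictions to train on.

```r
library(h2o)
library(h2oEnsemble)
h2o.init(nthreads = -1)

# Placeholder data: any H2OFrame with a binary response column would do.
train <- h2o.importFile("train.csv")   # hypothetical path
y <- "response"                        # hypothetical column name
x <- setdiff(names(train), y)
train[, y] <- as.factor(train[, y])

metalearner <- "h2o.glm.wrapper"

# Route 1: h2o.ensemble() trains and cross-validates the base learners
# itself, then fits the metalearner on their cross-validated predictions.
learner <- c("h2o.glm.wrapper", "h2o.randomForest.wrapper", "h2o.gbm.wrapper")
fit <- h2o.ensemble(x = x, y = y, training_frame = train,
                    family = "binomial",
                    learner = learner,
                    metalearner = metalearner,
                    cvControl = list(V = 5))

# Route 2: h2o.stack() reuses models that are already trained.
# Every base model must share the same folds (hence "Modulo") and
# must keep its cross-validation predictions.
gbm <- h2o.gbm(x = x, y = y, training_frame = train,
               nfolds = 5, fold_assignment = "Modulo",
               keep_cross_validation_predictions = TRUE, seed = 1)
rf <- h2o.randomForest(x = x, y = y, training_frame = train,
                       nfolds = 5, fold_assignment = "Modulo",
                       keep_cross_validation_predictions = TRUE, seed = 1)
stack <- h2o.stack(models = list(gbm, rf),
                   response_frame = train[, y],
                   metalearner = metalearner)
```

Both `fit` and `stack` are the same kind of ensemble object, so they can be used the same way afterwards, for example with `predict(fit, newdata)`.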

As of the latest stable release at the time of writing (H2O 3.10.3.*), stacking is available natively in H2O (R, Python, Java, Scala) as the "Stacked Ensemble" method; more information is in the H2O documentation for Stacked Ensembles. However, the h2oEnsemble R package (where the h2o.ensemble() and h2o.stack() functions live) will continue to be supported as well.
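For comparison, a minimal sketch of the native route in R might look like the following. It uses h2o.stackedEnsemble() from the h2o package; the data path and column name are again placeholders, and argument details have changed between H2O releases, so treat this as an outline to check against the documentation for your installed version rather than a definitive recipe.

```r
library(h2o)
h2o.init(nthreads = -1)

# Placeholder data, as before.
train <- h2o.importFile("train.csv")   # hypothetical path
y <- "response"                        # hypothetical column name
x <- setdiff(names(train), y)
train[, y] <- as.factor(train[, y])

# Base models still need identical folds and saved CV predictions.
gbm <- h2o.gbm(x = x, y = y, training_frame = train,
               nfolds = 5, fold_assignment = "Modulo",
               keep_cross_validation_predictions = TRUE, seed = 1)
rf <- h2o.randomForest(x = x, y = y, training_frame = train,
                       nfolds = 5, fold_assignment = "Modulo",
                       keep_cross_validation_predictions = TRUE, seed = 1)

# Native stacking: the base models are passed by model ID here.
ensemble <- h2o.stackedEnsemble(x = x, y = y, training_frame = train,
                                base_models = list(gbm@model_id, rf@model_id))
```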

Erin LeDell answered Sep 22 '22 02:09
