 

What is the difference between xgb.train and xgb.XGBRegressor (or xgb.XGBClassifier)?


I already know "xgboost.XGBRegressor is a Scikit-Learn Wrapper interface for XGBoost."

But do they have any other differences?

Statham asked Nov 07 '17

People also ask

What is Xgb train?

xgb.train is an advanced interface for training an xgboost model. The xgboost function is a simpler wrapper for xgb.train.
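
For illustration, a minimal sketch of the low-level interface (synthetic data; the parameter values are arbitrary, not recommendations):

    import numpy as np
    import xgboost as xgb

    # Toy regression data
    X, y = np.random.rand(100, 5), np.random.rand(100)
    dtrain = xgb.DMatrix(X, label=y)

    # Native (booster) parameters; 'eta' is the learning rate.
    # Recent versions spell the objective "reg:squarederror";
    # older releases used "reg:linear".
    params = {"objective": "reg:squarederror", "max_depth": 3, "eta": 0.1}
    booster = xgb.train(params, dtrain, num_boost_round=50)
    preds = booster.predict(dtrain)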

What is Xgbregressor?

XGBoost stands for "Extreme Gradient Boosting" and is an implementation of the gradient-boosted trees algorithm. XGBoost is a popular supervised machine learning model, known for its computation speed, parallelization, and performance.

Can I use Xgbregressor for classification?

@Baraban no, you can't. You can use a squared loss for classification, but you cannot use a classifier for regression.

What is Xgb DMatrix?

DMatrix is an internal data structure used by XGBoost, optimized for both memory efficiency and training speed. You can construct a DMatrix from multiple different data sources.
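
A short sketch of constructing a DMatrix from two common sources (NumPy and pandas; the data is synthetic):

    import numpy as np
    import pandas as pd
    import xgboost as xgb

    X = np.random.rand(10, 3)
    y = np.random.rand(10)

    # From a NumPy array
    dmat_np = xgb.DMatrix(X, label=y)

    # From a pandas DataFrame; feature names are taken from the columns
    df = pd.DataFrame(X, columns=["a", "b", "c"])
    dmat_df = xgb.DMatrix(df, label=y)

    print(dmat_np.num_row(), dmat_np.num_col())  # 10 3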


2 Answers

xgboost.train is the low-level API for training a model via the gradient boosting method.

xgboost.XGBRegressor and xgboost.XGBClassifier are the wrappers (Scikit-Learn-like wrappers, as they call them) that prepare the DMatrix and pass in the corresponding objective function and parameters. In the end, the fit call simply boils down to:

    self._Booster = train(params, dmatrix,
                          self.n_estimators, evals=evals,
                          early_stopping_rounds=early_stopping_rounds,
                          evals_result=evals_result, obj=obj, feval=feval,
                          verbose_eval=verbose)

This means that everything that can be done with XGBRegressor and XGBClassifier is doable via the underlying xgboost.train function. The reverse is not true, however: some useful parameters of xgboost.train are not supported in the XGBModel API. The list of notable differences includes (a sketch of the basic equivalence follows the list):

  • xgboost.train allows setting the callbacks applied at the end of each iteration.
  • xgboost.train allows training continuation via the xgb_model parameter.
  • xgboost.train allows not only minimization of the eval function, but maximization as well.
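
To make the correspondence concrete, here is a rough equivalence sketch (synthetic data; the parameters are set explicitly to guard against version-dependent defaults, and small numeric differences are still possible across versions):

    import numpy as np
    import xgboost as xgb

    X, y = np.random.rand(200, 4), np.random.rand(200)

    # Scikit-Learn wrapper: n_estimators maps to num_boost_round internally
    reg = xgb.XGBRegressor(n_estimators=50, max_depth=3, learning_rate=0.1,
                           objective="reg:squarederror")
    reg.fit(X, y)

    # Equivalent low-level call with the native parameter names
    params = {"max_depth": 3, "eta": 0.1, "objective": "reg:squarederror"}
    booster = xgb.train(params, xgb.DMatrix(X, label=y), num_boost_round=50)

    # The two models should produce (near-)identical predictions
    print(np.allclose(reg.predict(X), booster.predict(xgb.DMatrix(X)), atol=1e-5))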
Maxim answered Sep 28 '22


@Maxim, as of xgboost 0.90 (or much earlier), these differences no longer exist, in that xgboost.XGBClassifier.fit:

  • has callbacks
  • allows continuation with the xgb_model parameter
  • and supports the same built-in eval metrics or custom eval functions

What I find different is evals_result: it has to be retrieved separately after fit (clf.evals_result()), and the resulting dict is different because it can't take advantage of the names of the evals in the watchlist (watchlist = [(d_train, 'train'), (d_valid, 'valid')]).
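
A minimal sketch of that naming difference (synthetic data; assumes a 1.x-era API where eval_set and evals_result() behave as described):

    import numpy as np
    import xgboost as xgb

    X = np.random.rand(200, 4)
    y = (np.random.rand(200) > 0.5).astype(int)
    X_tr, X_va, y_tr, y_va = X[:150], X[150:], y[:150], y[150:]

    # Low-level API: you name the eval sets yourself via the watchlist
    d_train = xgb.DMatrix(X_tr, label=y_tr)
    d_valid = xgb.DMatrix(X_va, label=y_va)
    evals_result = {}
    xgb.train({"objective": "binary:logistic"}, d_train, num_boost_round=10,
              evals=[(d_train, "train"), (d_valid, "valid")],
              evals_result=evals_result)
    print(evals_result.keys())   # dict_keys(['train', 'valid'])

    # Sklearn wrapper: eval names are auto-generated, not user-chosen
    clf = xgb.XGBClassifier(n_estimators=10)
    clf.fit(X_tr, y_tr, eval_set=[(X_tr, y_tr), (X_va, y_va)], verbose=False)
    print(clf.evals_result().keys())  # dict_keys(['validation_0', 'validation_1'])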

paulperry answered Sep 28 '22