I am using Python to fit an XGBoost model incrementally (chunk by chunk). I came across a solution that uses xgboost.train, but I do not know what to do with the Booster object that it returns. For instance, XGBClassifier has methods like fit, predict, predict_proba, etc.
Here is what happens inside the for loop in which I read the data in little by little:
dtrain = xgb.DMatrix(X_train, label=y)
param = {'max_depth': 2, 'eta': 1, 'silent': 1, 'objective': 'binary:logistic'}
# continue training from the model saved on the previous iteration
modelXG = xgb.train(param, dtrain, xgb_model='xgbmodel')
modelXG.save_model('xgbmodel')
XGBClassifier is a scikit-learn compatible class which can be used in conjunction with other scikit-learn utilities. Other than that, it's just a wrapper over xgb.train, in which you don't need to supply advanced objects like Booster etc. Just send your data to fit(), predict(), etc. and internally it will be converted to the appropriate objects automatically.
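For illustration, here is a minimal sketch of the two interfaces side by side (the data, hyperparameters, and number of boosting rounds are placeholders I chose, not from the question):

import numpy as np
import xgboost as xgb

# Placeholder data, purely for illustration
X_train = np.random.rand(100, 5)
y = np.random.randint(0, 2, size=100)

# Low-level API: you build the DMatrix and work with the returned Booster
dtrain = xgb.DMatrix(X_train, label=y)
booster = xgb.train({'objective': 'binary:logistic'}, dtrain, num_boost_round=10)
proba = booster.predict(xgb.DMatrix(X_train))  # probabilities for binary:logistic

# scikit-learn API: plain arrays in, familiar estimator methods out
clf = xgb.XGBClassifier(n_estimators=10, objective='binary:logistic')
clf.fit(X_train, y)
labels = clf.predict(X_train)        # predicted class labels
probas = clf.predict_proba(X_train)  # class probabilities, shape (n_samples, 2)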
I'm not entirely sure what your question was. xgb.XGBClassifier.fit() under the hood calls xgb.train(), so it is a matter of matching up the arguments of the relevant functions.
If you are interested in how to implement the learning that you have in mind, then you can do

clf = xgb.XGBClassifier(**params)
clf.fit(X, y, xgb_model=your_model)

See the documentation for XGBClassifier.fit(). On each iteration you will have to save the booster using something like clf.get_booster().save_model(xxx).
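Putting it together, a minimal sketch of the chunk-by-chunk loop could look like this (iter_chunks, the file name, and the hyperparameters are my own placeholder assumptions, not from the answer):

import xgboost as xgb

params = {'max_depth': 2, 'learning_rate': 1.0, 'n_estimators': 10,
          'objective': 'binary:logistic'}
model_path = 'xgbmodel.json'  # hypothetical file name

booster = None  # nothing to continue from on the first chunk
for X_chunk, y_chunk in iter_chunks():  # hypothetical generator yielding mini-batches
    clf = xgb.XGBClassifier(**params)
    # xgb_model=None trains from scratch; a Booster (or a saved model path)
    # makes fit() continue training on top of the existing trees
    clf.fit(X_chunk, y_chunk, xgb_model=booster)
    booster = clf.get_booster()
    booster.save_model(model_path)

Each fit() call adds n_estimators new trees on top of the loaded booster, so the model grows by a fixed number of boosting rounds per chunk.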
PS: I hope you do the learning in mini-batches, i.e. chunks, and not literally line by line, i.e. example by example, as that would result in a performance drop due to writing/reading the model each time.