
Access train and evaluation error in xgboost

I started using the Python xgboost package. Is there a way to get the training and validation errors at each training epoch? I can't find one in the documentation.

I have trained a simple model and got this output:

[09:17:37] src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6
[0] eval-rmse:0.407474 train-rmse:0.346349
[09:17:37] src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 116 extra nodes, 0 pruned nodes, max_depth=6
[1] eval-rmse:0.410902 train-rmse:0.339925
[09:17:38] src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6
[2] eval-rmse:0.413563 train-rmse:0.335941
[09:17:38] src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 126 extra nodes, 0 pruned nodes, max_depth=6
[3] eval-rmse:0.418412 train-rmse:0.333071
[09:17:38] src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 114 extra nodes, 0 pruned nodes, max_depth=6

However, I need to pass these eval-rmse and train-rmse values on to later code, or at least plot the curves.

MaxPY asked Feb 04 '16 09:02




2 Answers

One way to save your intermediate results is to pass the evals_result argument to xgb.train.

Let's say you have created a train and an eval matrix in XGB format (DMatrix), and have initialized some parameters params for XGBoost (in my case, params = {'max_depth':2, 'eta':1, 'silent':1, 'objective':'binary:logistic' }).

  1. Create an empty dict

    progress = dict()

  2. Create a watchlist (I guess you already have one, given that you are printing train-rmse)

    watchlist = [(train,'train-rmse'), (eval, 'eval-rmse')]

  3. Pass these to xgb.train

    bst = xgb.train(params, train, 10, watchlist, evals_result=progress)

When training finishes, the progress dictionary will contain the train/validation errors for every boosting round:

> print progress
{'train-rmse': {'error': ['0.50000', ....]}, 'eval-rmse': { 'error': ['0.5000',....]}}
Sudeep Juvekar answered Sep 28 '22 19:09



@MaxPY, this is in reply to your comment on Sudeep Juvekar's answer above: the keys of your progress dictionary are set to whatever strings you pass as the second element of each watchlist tuple. For instance,

watchlist  = [(train,'train-rmse-demo'), (eval, 'eval-rmse-demo')]

sets the dictionary keys to train-rmse-demo and eval-rmse-demo.
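
To make the key layout concrete, here is a small sketch that pulls one metric out of such a dictionary. The helper name curves is hypothetical, and the dictionary is written by hand in the shape xgb.train fills in, using the first four rounds from the log in the question stored as strings, as in the printed output above. The inner key is the metric name ('rmse' here, 'error' in the binary:logistic example).

```python
def curves(evals_result, metric):
    """Return {watchlist_name: [float, ...]} for one metric.

    `curves` is a hypothetical helper, not part of xgboost.
    float() handles values stored either as strings or as floats.
    """
    return {
        name: [float(v) for v in metrics[metric]]
        for name, metrics in evals_result.items()
    }


# Hand-written stand-in for the dict that xgb.train fills in;
# the numbers are copied from the log in the question.
progress = {
    "train-rmse-demo": {"rmse": ["0.346349", "0.339925", "0.335941", "0.333071"]},
    "eval-rmse-demo": {"rmse": ["0.407474", "0.410902", "0.413563", "0.418412"]},
}

by_name = curves(progress, "rmse")

# Index of the boosting round with the lowest validation error.
eval_curve = by_name["eval-rmse-demo"]
best_round = min(range(len(eval_curve)), key=eval_curve.__getitem__)
print(sorted(by_name), best_round)
```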

Sunny Jha answered Sep 28 '22 17:09
