I have started using the Python xgboost package. Is there a way to get the training and validation errors at each training epoch? I can't find one in the documentation.
I have trained a simple model and got this output:
[09:17:37] src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6
[0] eval-rmse:0.407474 train-rmse:0.346349
[09:17:37] src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 116 extra nodes, 0 pruned nodes, max_depth=6
[1] eval-rmse:0.410902 train-rmse:0.339925
[09:17:38] src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6
[2] eval-rmse:0.413563 train-rmse:0.335941
[09:17:38] src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 126 extra nodes, 0 pruned nodes, max_depth=6
[3] eval-rmse:0.418412 train-rmse:0.333071
[09:17:38] src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 114 extra nodes, 0 pruned nodes, max_depth=6
However, I need to pass these eval-rmse and train-rmse values further in code, or at least plot these curves.
I created a gist of a Jupyter notebook to demonstrate that an xgboost model can be trained incrementally. I used the Boston dataset to train the model and ran three experiments: one-shot learning, iterative one-shot learning, and iterative incremental learning.
To activate early stopping in boosting algorithms like XGBoost, LightGBM and CatBoost, we should specify an integer value in the argument called early_stopping_rounds which is available in the fit() method or train() function of boosting models.
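To make the idea concrete, here is a minimal pure-Python sketch of a patience-based stopping rule applied to the eval-rmse values from the log in the question. The function name early_stopping_round and the patience argument are made up for this illustration; it mimics the idea behind early_stopping_rounds and is not the library's actual implementation.

```python
def early_stopping_round(scores, patience):
    """Return (stop_round, best_round) for a metric where lower is better.

    stop_round is None if the rule never triggers within `scores`.
    """
    best_round = 0
    best_score = float("inf")
    for round_idx, score in enumerate(scores):
        if score < best_score:
            # New best validation score: remember where it happened.
            best_score, best_round = score, round_idx
        elif round_idx - best_round >= patience:
            # No improvement for `patience` consecutive rounds: stop here.
            return round_idx, best_round
    return None, best_round


# eval-rmse values copied from the log in the question
eval_rmse = [0.407474, 0.410902, 0.413563, 0.418412]
stop_round, best_round = early_stopping_round(eval_rmse, patience=3)
print(stop_round, best_round)
```

With these values the validation error never improves after round 0, so a patience of 3 stops at round 3 and reports round 0 as the best one.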
One way to save your intermediate results is by passing the evals_result argument to the xgb.train method.
Let's say you have created a train and an eval matrix in XGBoost's DMatrix format, and have initialized some parameters params for XGBoost (in my case, params = {'max_depth':2, 'eta':1, 'silent':1, 'objective':'binary:logistic' }).
Create an empty dict
progress = dict()
Create a watchlist (I guess you already have one, given that you are printing train-rmse)
watchlist = [(train,'train-rmse'), (eval, 'eval-rmse')]
Pass these to xgb.train
bst = xgb.train(params, train, 10, watchlist, evals_result=progress)
At the end of training, the progress dictionary will contain the desired train/validation errors
> print(progress)
{'train-rmse': {'error': ['0.50000', ....]}, 'eval-rmse': { 'error': ['0.5000',....]}}
@MaxPY, this is in reply to your comment on Sudeep Juvekar's answer above: the keys of your progress dictionary are set to whatever strings you pass as the second element of each watchlist pair. For instance,
watchlist = [(train,'train-rmse-demo'), (eval, 'eval-rmse-demo')]
sets the dictionary keys to train-rmse-demo and eval-rmse-demo
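If you want to pass the recorded values further in code, one option is to flatten the nested dictionary into one list of floats per curve. The sketch below uses only the standard library; the helper name to_curves and the sample numbers are hypothetical and merely shaped like the dictionary shown above. The float() call also covers older versions that store the values as strings.

```python
def to_curves(progress):
    """Flatten {name: {metric: [values]}} into {"name-metric": [floats]}."""
    curves = {}
    for name, metrics in progress.items():
        for metric, values in metrics.items():
            # float() handles both string and numeric entries.
            curves[f"{name}-{metric}"] = [float(v) for v in values]
    return curves


# Hypothetical recorded values, keyed by the names given in the watchlist
progress = {
    "train-rmse-demo": {"error": ["0.50000", "0.40000"]},
    "eval-rmse-demo": {"error": ["0.50000", "0.45000"]},
}
curves = to_curves(progress)
print(sorted(curves))
```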