XGBoost Best Iteration

I am running a regression with XGBoost as follows:

clf = XGBRegressor(eval_set = [(X_train, y_train), (X_val, y_val)],
                       early_stopping_rounds = 10, 
                       n_estimators = 10,                    
                       verbose = 50)

clf.fit(X_train, y_train, verbose=False)
print("Best Iteration: {}".format(clf.booster().best_iteration))

The model trains correctly, but the print call raises the following error:

TypeError: 'str' object is not callable

How can I get the number of the best iteration of the model?

Furthermore, how can I print the training error of each round?

Alessandro Ceccarelli asked Aug 21 '18 19:08


2 Answers

For your TypeError: use get_booster() instead of booster()

print("Best Iteration: {}".format(clf.get_booster().best_iteration))

To use the best iteration when you predict, pass the ntree_limit parameter, which specifies the number of trees to use. The value produced by the training process is best_ntree_limit, which you can read after training your model in the following manner: clf.get_booster().best_ntree_limit. More specifically, when you predict, use:

# Number of trees up to and including the best round found during training
best_iteration = clf.get_booster().best_ntree_limit
clf.predict(data, ntree_limit=best_iteration)
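
The difference between best_iteration (a zero-based round index) and the tree count passed to ntree_limit is easy to miss. The sketch below is a plain-Python illustration of how early stopping picks the best round; it is not XGBoost's implementation. The function name and the error values are hypothetical, and it assumes one tree per boosting round.

```python
def best_iteration_with_early_stopping(val_errors, early_stopping_rounds):
    """Return (best_iteration, stopped_at) for per-round validation errors.

    Illustrative only: training stops once the error has not improved
    for `early_stopping_rounds` consecutive rounds.
    """
    best_iteration = 0
    best_error = float("inf")
    for i, err in enumerate(val_errors):
        if err < best_error:
            # New best round: remember its zero-based index
            best_error = err
            best_iteration = i
        elif i - best_iteration >= early_stopping_rounds:
            # No improvement for the allowed number of rounds: stop here
            return best_iteration, i
    return best_iteration, len(val_errors) - 1


# Hypothetical validation errors, one per boosting round
errors = [0.90, 0.70, 0.60, 0.62, 0.61, 0.65, 0.64]
best, stopped_at = best_iteration_with_early_stopping(errors, early_stopping_rounds=3)

# A zero-based index of 2 means the first 3 trees should be used
trees_to_use = best + 1
print(best, stopped_at, trees_to_use)
```

Here the best round is index 2, training stops at index 5, and three trees are used for prediction. Newer XGBoost releases replace ntree_limit with an iteration_range argument to predict, so check the documentation for your installed version.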

You can print the training and evaluation progress by passing these parameters to the .fit() call:

clf.fit(X_train, y_train,
        eval_set = [(X_train, y_train), (X_val, y_val)],
        eval_metric = 'rmse',
        early_stopping_rounds = 10, verbose=True)

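NOTE: in older XGBoost releases the early_stopping_rounds parameter belongs in the .fit() call, not in the XGBRegressor() constructor. Releases from 1.6 on accept early_stopping_rounds in the constructor and deprecate passing it to .fit(), so check the documentation for your installed version.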

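Another NOTE: verbose = 50 in XGBRegressor() is redundant. The verbose argument belongs in the .fit() call and takes True or False. With verbose=True, the evaluation metric for each entry in eval_set is printed at every boosting round, which directly answers your question about printing the training error of each round. See the verbose entry in the .fit() documentation for details.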

Eran Moshe answered Nov 15 '22 14:11


Your error is that the booster attribute of XGBRegressor is a string that specifies the kind of booster to use, not the actual booster instance. From the docs:

booster: string
Specify which booster to use: gbtree, gblinear or dart.

In order to get the actual booster, you can call get_booster() instead:

>>> clf.booster
'gbtree'
>>> clf.get_booster()
<xgboost.core.Booster object at 0x118c40cf8>
>>> clf.get_booster().best_iteration
9
>>> print("Best Iteration: {}".format(clf.get_booster().best_iteration))
Best Iteration: 9
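
To see where the error message comes from without installing anything, here is a small stand-in written in plain Python. FakeRegressor and FakeBooster are hypothetical classes that only mimic the shape described above (a string attribute named booster plus a get_booster() method); they are not part of XGBoost.

```python
class FakeBooster:
    """Stand-in for the trained booster object (hypothetical)."""
    best_iteration = 9


class FakeRegressor:
    """Stand-in that mimics the attribute layout described above (hypothetical)."""

    def __init__(self, booster="gbtree"):
        self.booster = booster          # a plain string naming the booster type
        self._trained = FakeBooster()   # the actual booster object

    def get_booster(self):
        return self._trained


model = FakeRegressor()

try:
    model.booster()  # calling a string attribute as if it were a method
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# The accessor method returns the object that carries best_iteration
best = model.get_booster().best_iteration
print("Best Iteration: {}".format(best))
```

The parentheses after booster try to call the string 'gbtree', which is exactly what raises the TypeError in the question.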

I'm not sure about the second half of your question, namely:

Furthermore, how can I print the training error of each round?

but hopefully you're unblocked!

Samuel Dion-Girardeau answered Nov 15 '22 14:11