I saw that some xgboost methods take a parameter num_boost_round, like this:
cv_results = xgb.cv(params, dtrain, num_boost_round=500, early_stopping_rounds=100)
Others, however, take n_estimators, like this:
model_xgb = xgb.XGBRegressor(n_estimators=360, max_depth=2, learning_rate=0.1)
As far as I understand, each time boosting is applied a new estimator is created. Is that not correct?
If that is so, then the numbers num_boost_round and n_estimators should be equal, right?
The first parameter, num_boost_round, corresponds to the number of boosting rounds, i.e. trees, to build. Its optimal value depends strongly on the other parameters, so it should be re-tuned each time you update another one; pairing it with early_stopping_rounds is the usual way to do that cheaply.
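As a concrete illustration, here is a minimal sketch of re-tuning num_boost_round with xgb.cv and early stopping; the synthetic data and the specific parameter values are assumptions for the example, not part of the original question:

import numpy as np
import xgboost as xgb

# Synthetic regression data (illustrative assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = 2 * X[:, 0] + rng.normal(scale=0.1, size=500)
dtrain = xgb.DMatrix(X, label=y)

params = {"objective": "reg:squarederror", "max_depth": 2, "eta": 0.1}

# xgb.cv returns a DataFrame with one row per boosting round; with
# early stopping it stops once the test metric has not improved for
# 20 rounds, so the number of rows is the tuned round count.
cv_results = xgb.cv(params, dtrain, num_boost_round=500,
                    early_stopping_rounds=20, nfold=5, seed=0)
print("tuned num_boost_round:", len(cv_results))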
Setting n_estimators=1 makes XGBoost generate a single tree (essentially no boosting happens), which is similar to sklearn's single-tree DecisionTreeClassifier. However, the hyperparameters that can be tuned and the tree-growing process differ between the two.
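A minimal sketch of that comparison, assuming a toy dataset from sklearn (the dataset and depth settings are illustrative, not from the original post):

from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from xgboost import XGBClassifier

X, y = make_classification(n_samples=300, random_state=0)

# One boosting round -> one tree, but grown with XGBoost's criteria
# (second-order gradients plus regularization), not CART's impurity splits.
xgb_one_tree = XGBClassifier(n_estimators=1, max_depth=3).fit(X, y)
cart_tree = DecisionTreeClassifier(max_depth=3).fit(X, y)

print(xgb_one_tree.score(X, y), cart_tree.score(X, y))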
One more subtlety: the estimator used in each round is not necessarily a single tree; xgboost lets the user boost a linear model, a decision tree, or even a small random forest per round via num_parallel_tree. With num_parallel_tree=3, for example, each iteration builds 3 trees. Boosting still works the usual way (build, correct the errors, repeat); each step just uses a forest instead of one tree. This could mean that fewer boosting rounds are needed before the model becomes "optimal", although the time and resources saved by using fewer rounds may be consumed by constructing a forest in every one of them. (Note that in XGBoost's scikit-learn random-forest wrappers, XGBRFRegressor and XGBRFClassifier, it is num_parallel_tree that n_estimators maps onto, not the number of boosting rounds.)
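Here is a minimal sketch of that boosted-random-forest setup in the native API; the data and the specific values (3 parallel trees, 10 rounds) are assumptions for illustration:

import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8))
y = X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=400)
dtrain = xgb.DMatrix(X, label=y)

params = {
    "objective": "reg:squarederror",
    "num_parallel_tree": 3,   # 3 trees per boosting round
    "subsample": 0.8,         # row subsampling, as in a random forest
    "colsample_bynode": 0.8,  # feature subsampling per split
}
booster = xgb.train(params, dtrain, num_boost_round=10)

# 10 rounds x 3 parallel trees = 30 trees in total.
print(len(booster.get_dump()))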
As for the question itself: yes, they are the same, both referring to the same parameter (see the docs, or the github issue). The reason for the different name is that xgb.XGBRegressor is an implementation of the scikit-learn API, and scikit-learn conventionally uses n_estimators to refer to the number of boosting stages (for example in GradientBoostingClassifier).
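A quick sketch that checks the equivalence directly, by comparing the number of trees each API builds (the data and settings are illustrative assumptions):

import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X.sum(axis=1)

# Native API: the number of rounds is passed at train time.
booster = xgb.train(
    {"objective": "reg:squarederror", "max_depth": 2, "eta": 0.1},
    xgb.DMatrix(X, label=y), num_boost_round=360)

# scikit-learn API: the same number is passed as n_estimators.
sk_model = xgb.XGBRegressor(n_estimators=360, max_depth=2,
                            learning_rate=0.1).fit(X, y)

print(len(booster.get_dump()))                 # 360
print(len(sk_model.get_booster().get_dump()))  # 360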