
XGBRegressor: changing random_state has no effect

xgboost.XGBRegressor seems to produce the same results even when a new random seed is given.

According to the xgboost documentation for xgboost.XGBRegressor:

seed : int Random number seed. (Deprecated, please use random_state)

random_state : int Random number seed. (replaces seed)

So random_state is the one to use. However, no matter what random_state or seed I pass, the model produces the same results. Is this a bug?

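from xgboost import XGBRegressor
# Note: load_boston was removed in scikit-learn 1.2; this snippet needs an older version.
from sklearn.datasets import load_boston
import numpy as np
from itertools import product

def xgb_train_predict(random_state=0, seed=None):
    # Train on the full dataset and return the in-sample predictions.
    X, y = load_boston(return_X_y=True)
    xgb = XGBRegressor(random_state=random_state, seed=seed)
    xgb.fit(X, y)
    y_ = xgb.predict(X)
    return y_

# Baseline predictions with the default arguments.
check = xgb_train_predict()

random_state = [1, 42, 58, 69, 72]
seed = [None, 2, 24, 85, 96]

# Every combination must reproduce the baseline exactly, or the assert fails.
for r, s in product(random_state, seed):
    y_ = xgb_train_predict(r, s)
    assert np.equal(y_, check).all()
    print('CHECK! \t random_state: {} \t seed: {}'.format(r, s))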

[Out]:
    CHECK!   random_state: 1     seed: None
    CHECK!   random_state: 1     seed: 2
    CHECK!   random_state: 1     seed: 24
    CHECK!   random_state: 1     seed: 85
    CHECK!   random_state: 1     seed: 96
    CHECK!   random_state: 42    seed: None
    CHECK!   random_state: 42    seed: 2
    CHECK!   random_state: 42    seed: 24
    CHECK!   random_state: 42    seed: 85
    CHECK!   random_state: 42    seed: 96
    CHECK!   random_state: 58    seed: None
    CHECK!   random_state: 58    seed: 2
    CHECK!   random_state: 58    seed: 24
    CHECK!   random_state: 58    seed: 85
    CHECK!   random_state: 58    seed: 96
    CHECK!   random_state: 69    seed: None
    CHECK!   random_state: 69    seed: 2
    CHECK!   random_state: 69    seed: 24
    CHECK!   random_state: 69    seed: 85
    CHECK!   random_state: 69    seed: 96
    CHECK!   random_state: 72    seed: None
    CHECK!   random_state: 72    seed: 2
    CHECK!   random_state: 72    seed: 24
    CHECK!   random_state: 72    seed: 85
    CHECK!   random_state: 72    seed: 96
asked Jun 11 '18 11:06 by LingxB




1 Answer

It seems (I didn't know this myself before digging for an answer) that xgboost uses its random number generator only for sub-sampling; see this comment by Laurae on a similar GitHub issue. Otherwise, the behavior is deterministic.

If you had used sampling, note that there is an issue in how the current sklearn API in xgboost handles seed/random_state. seed is indeed marked as deprecated, but if you provide it, it is still used in preference to random_state, as can be seen here in the code. This only matters when seed is not None.
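The precedence described above can be sketched in a few lines. The function effective_seed is a hypothetical stand-in written for illustration only; it is not xgboost's actual code, just the rule as stated in this answer:

```python
from typing import Optional


def effective_seed(random_state: int = 0, seed: Optional[int] = None) -> int:
    # Hypothetical helper, not part of xgboost: mirrors the precedence described above.
    # A seed that is not None wins; otherwise random_state is used.
    return seed if seed is not None else random_state


print(effective_seed(random_state=42))          # seed omitted, random_state applies
print(effective_seed(random_state=42, seed=2))  # seed given, random_state is ignored
print(effective_seed(random_state=42, seed=0))  # seed=0 is not None, so it still wins
```

Under this rule, varying random_state while also passing a fixed non-None seed would change nothing, even with sampling enabled.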

answered Sep 22 '22 13:09 by Mischa Lisovyi