I use the xgboost sklearn interface below to create and train a model (model-1):
import xgboost as xgb

clf = xgb.XGBClassifier(n_estimators=100, objective='binary:logistic')
clf.fit(x_train, y_train, early_stopping_rounds=10, eval_metric="auc",
        eval_set=[(x_valid, y_valid)])
And the same model can be created with the native xgboost API as model-2 below:
param = {}
param['objective'] = 'binary:logistic'
param['eval_metric'] = 'auc'
num_rounds = 100
xgtrain = xgb.DMatrix(x_train, label=y_train)
xgval = xgb.DMatrix(x_valid, label=y_valid)
watchlist = [(xgtrain, 'train'), (xgval, 'val')]
model = xgb.train(param, xgtrain, num_rounds, watchlist, early_stopping_rounds=10)
I think all the parameters are the same between model-1 and model-2, but the validation scores are different. Is there any difference between model-1 and model-2?
XGBoost is easy to use through scikit-learn. XGBoost is an ensemble of boosted trees, so it typically scores better than individual models.
xgb.train is an advanced interface for training an xgboost model; the xgboost function is a simpler wrapper around xgb.train.
DMatrix is an internal data structure used by XGBoost that is optimized for both memory efficiency and training speed. You can construct a DMatrix from multiple different sources of data.
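For example, here is a minimal sketch (using made-up placeholder arrays) of building a DMatrix from a NumPy array, a SciPy sparse matrix, and a pandas DataFrame:

import numpy as np
import pandas as pd
import scipy.sparse as sp
import xgboost as xgb

dense = np.random.rand(100, 5)                       # dense NumPy array
labels = np.random.randint(2, size=100)              # binary labels
dtrain_dense = xgb.DMatrix(dense, label=labels)

sparse = sp.csr_matrix(dense)                        # CSR sparse matrix
dtrain_sparse = xgb.DMatrix(sparse, label=labels)

frame = pd.DataFrame(dense, columns=[f"f{i}" for i in range(5)])
dtrain_frame = xgb.DMatrix(frame, label=labels)      # pandas DataFrame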
As I understand, there are many differences between the default parameters in native xgb and in its sklearn interface. For example, native xgb defaults to eta=0.3 while the sklearn wrapper defaults to learning_rate=0.1. You can see more about the default parameters of each implementation here:
https://github.com/dmlc/xgboost/blob/master/doc/parameter.md
http://xgboost.readthedocs.io/en/latest/python/python_api.html#module-xgboost.sklearn
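So the practical fix is to set the overlapping parameters explicitly in both interfaces instead of relying on defaults. Below is a rough sketch of that, reusing x_train/y_train/x_valid/y_valid from the question; the explicit values (max_depth=6, eta/learning_rate=0.3) are just examples, and note that in recent xgboost versions eval_metric and early_stopping_rounds are constructor arguments rather than fit() arguments, as assumed here:

import xgboost as xgb

# sklearn interface: pass explicit values instead of the wrapper defaults
clf = xgb.XGBClassifier(n_estimators=100, max_depth=6, learning_rate=0.3,
                        objective='binary:logistic')
clf.fit(x_train, y_train, early_stopping_rounds=10, eval_metric="auc",
        eval_set=[(x_valid, y_valid)])

# native interface: the same values (eta is the native name for learning_rate)
param = {'max_depth': 6, 'eta': 0.3, 'objective': 'binary:logistic',
         'eval_metric': 'auc'}
dtrain = xgb.DMatrix(x_train, label=y_train)
dvalid = xgb.DMatrix(x_valid, label=y_valid)
model = xgb.train(param, dtrain, num_boost_round=100,
                  evals=[(dtrain, 'train'), (dvalid, 'val')],
                  early_stopping_rounds=10)

With the shared parameters pinned like this, the two models should give much closer validation scores, since the remaining differences are no longer hidden in the interface defaults.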