
XGBoost C API doesn't produce the same results as Python

I have a very simple dataset (30 rows, 32 columns).

I wrote a Python program to load the data and train an XGBoost model, then save the model to disk.

I also compiled a C++ program that uses libxgboost (the C API) and loads the model for inference.

When using the SAME saved model, Python and C++ give different results for the same input (a single row of all zeros).

The XGBoost version is 0.90, and I have attached all files (including the NumPy data files) here:

https://www.dropbox.com/s/txao5ugq6mgssz8/xgboost_mismatch.tar?dl=0

Here are the outputs of the two programs (the sources for both are in the .tar file):

The Python program

(which prints a few strings while building the model and THEN prints the single number output)

$ python3 jl_functions_tiny.py
Loading data
Creating model
Training model
Saving model
Deleting model
Loading model
Testing model
[587558.2]

The C++ program

(which emits a single number that clearly doesn't match the single Python number output)

$ ./jl_functions
628180.062500
user5406764 asked Nov 07 '22 13:11



1 Answer

A different seed parameter in Python and in C++ can cause different results, since the algorithm uses randomness. Try setting seed= to the same value on both sides: in Python, in the xgb.XGBRegressor call on line 11 (or even via NumPy, using numpy.random.seed(0)), and in C++, via the seed parameter from /workspace/include/xgboost/generic_parameters.h.
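To illustrate why a fixed seed matters, here is a toy sketch of seeded row subsampling, the kind of random step that XGBoost parameters such as subsample control. This is not XGBoost's own code: the helper name subsample_rows and the 0.5 fraction are made up for the illustration, and only the row count of 30 comes from the question.

```python
import random


def subsample_rows(n_rows, fraction, seed):
    """Pick a random subset of row indices, reproducibly for a given seed.

    Hypothetical helper for illustration only; not part of XGBoost.
    """
    # A private generator, so the global random state is left untouched.
    rng = random.Random(seed)
    k = max(1, int(n_rows * fraction))
    return sorted(rng.sample(range(n_rows), k))


# Same seed twice, then a different seed, on 30 rows as in the question.
same_a = subsample_rows(30, 0.5, seed=0)
same_b = subsample_rows(30, 0.5, seed=0)
other = subsample_rows(30, 0.5, seed=1)

print("seed 0:      ", same_a)
print("seed 0 again:", same_b)  # identical to the line above
print("seed 1:      ", other)   # a different seed generally picks other rows
```

With the same seed, both runs select the same rows, so anything computed from them matches. With different seeds the selected rows generally differ, which is the effect the answer suggests ruling out by fixing the seed on both sides.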

Omer Anisfeld answered Nov 15 '22 05:11
