I have a very simple dataset (30 rows, 32 columns).
I wrote a Python program to load the data and train an XGBoost model, then save the model to disk.
I also compiled a C++ program that uses libxgboost (the C API) and loads the model for inference.
When using the SAME saved model, Python and C++ give different results for the same input (a single row of all zeros).
XGBoost is version 0.90, and I have attached all files (including the NumPy data files) here:
https://www.dropbox.com/s/txao5ugq6mgssz8/xgboost_mismatch.tar?dl=0
Here are the outputs of the two programs (the sources of which are in the .tar file).

The Python program prints a few status strings while building the model and THEN prints the single-number output:
$ python3 jl_functions_tiny.py
Loading data
Creating model
Training model
Saving model
Deleting model
Loading model
Testing model
[587558.2]
The C++ program emits a single number that clearly doesn't match the Python output:
$ ./jl_functions
628180.062500
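For reference, here is roughly what jl_functions_tiny.py does (a sketch only; the data file names and model parameters are assumptions, the real source is in the tar):

import numpy as np
import xgboost as xgb

# Load the 30x32 training data (file names here are assumptions)
X = np.load("X.npy")
y = np.load("y.npy")

# Create and train the model
model = xgb.XGBRegressor()
model.fit(X, y)

# Save the model, delete it, then reload it through the Booster API
model.get_booster().save_model("model.bin")
del model

booster = xgb.Booster()
booster.load_model("model.bin")

# Score a single all-zeros row
print(booster.predict(xgb.DMatrix(np.zeros((1, 32)))))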
A different seed parameter in Python and in C++ can cause different results, since the algorithm makes use of randomness. Try setting the same seed in both programs: in Python, pass seed= to xgb.XGBRegressor (line 11 of your script), and/or fix NumPy's global RNG with numpy.random.seed(0); in C++, set the corresponding seed parameter declared in /workspace/include/xgboost/generic_parameters.h.
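A minimal sketch of the Python side of that suggestion (the seed keyword exists in the 0.90 sklearn wrapper, though random_state is the newer alias):

import numpy as np
import xgboost as xgb

# Fix NumPy's global RNG in case any preprocessing draws random numbers
np.random.seed(0)

# Pass an explicit seed so training is reproducible across runs
# (in newer wrappers, random_state=0 is the preferred spelling)
model = xgb.XGBRegressor(seed=0)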