Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

xgboost load model in c++ (python -> c++ prediction scores mismatch)

I'm reaching out to all SO c++ geniuses.

I've trained (and successfully tested) an xgboost model in python like so:

dtrain 
=xgb.DMatrix(np.asmatrix(X_train),label=np.asarray(y_train,dtype=np.int), feature_names=feat_names)

optimal_model = xgb.train(plst, dtrain)

dtest = xgb.DMatrix(np.asmatrix(X_test),feature_names=feat_names)

optimal_model.save_model('sigdet.model')

I've followed a post on the XgBoost (see link) which explains the correct way to load and apply prediction in c++:

// Load Model
g_learner = std::make_unique<Learner>(Learner::Create({}));
        std::unique_ptr<dmlc::Stream> fi(
            dmlc::Stream::Create(filename, "r"));
        g_learner->Load(fi.get());

// Predict
    DMatrixHandle h_test;
        XGDMatrixCreateFromMat((float *)features, 1, numFeatures , -999.9f, &h_test);
        xgboost::bst_ulong out_len;


        std::vector<float> preds;
        g_learner->Predict((DMatrix*)h_test,true, &preds); 

My problem (1): I need to create a DMatrix*, however I only have a DMatrixHandle. How do I properly create a DMatrix with my data?

My problem (2): When I tried the following prediction method:

DMatrixHandle h_test;
XGDMatrixCreateFromMat((float *)features, 1, numFeatures , -999.9f, &h_test);
xgboost::bst_ulong out_len;


int res = XGBoosterPredict(g_modelHandle, h_test, 1, 0, &out_len, (const float**)&scores);

I'm getting completely different scores than by loading the exact same model and using it for prediction (in python).

Whoever helps me achieve consistent results across c++ and python will probably go to heaven. BTW, I need to apply prediction in c++ for a real-time application, otherwise I would use a different language.

like image 992
Leeor Avatar asked Sep 05 '16 17:09

Leeor


1 Answers

To get the DMatrix you can do this:

g_learner->Predict(static_cast<std::shared_ptr<xgboost::DMatrix>*>(h_test)->get(), true, &pred);

For problem (2), I don't have an answer. This is actually the same problem I have. I've got a XGBRegression in python and I obtain different results with the same features in C++.

like image 163
Sabaton7734 Avatar answered Oct 03 '22 15:10

Sabaton7734