I'm reaching out to all SO C++ geniuses.
I've trained (and successfully tested) an XGBoost model in Python like so:
dtrain = xgb.DMatrix(np.asmatrix(X_train), label=np.asarray(y_train, dtype=int), feature_names=feat_names)
optimal_model = xgb.train(plst, dtrain)
dtest = xgb.DMatrix(np.asmatrix(X_test),feature_names=feat_names)
optimal_model.save_model('sigdet.model')
I've followed a post about XGBoost (see link) that explains the correct way to load a model and run prediction in C++:
// Load Model
g_learner = std::make_unique<Learner>(Learner::Create({}));
std::unique_ptr<dmlc::Stream> fi(
dmlc::Stream::Create(filename, "r"));
g_learner->Load(fi.get());
// Predict
DMatrixHandle h_test;
XGDMatrixCreateFromMat((float *)features, 1, numFeatures , -999.9f, &h_test);
xgboost::bst_ulong out_len;
std::vector<float> preds;
g_learner->Predict((DMatrix*)h_test,true, &preds);
My problem (1): I need a DMatrix*, but I only have a DMatrixHandle. How do I properly create a DMatrix from my data?
My problem (2): When I tried the following prediction method:
DMatrixHandle h_test;
XGDMatrixCreateFromMat((float *)features, 1, numFeatures , -999.9f, &h_test);
xgboost::bst_ulong out_len;
int res = XGBoosterPredict(g_modelHandle, h_test, 1, 0, &out_len, (const float**)&scores);
I get completely different scores than when I load the exact same model and use it for prediction in Python.
Whoever helps me achieve consistent results across C++ and Python will probably go to heaven. BTW, I need to run prediction in C++ for a real-time application; otherwise I would use a different language.
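For problem (1): the DMatrixHandle is an opaque pointer to a std::shared_ptr<xgboost::DMatrix>, so you can cast it back and call get() to obtain the DMatrix*:
g_learner->Predict(static_cast<std::shared_ptr<xgboost::DMatrix>*>(h_test)->get(), true, &preds);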
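The cast above relies on a common opaque-handle pattern. Here is a minimal sketch of that pattern, assuming (as the cast implies) that the handle is a void* pointing at a heap-allocated std::shared_ptr. FakeDMatrix, FakeHandle, CreateHandle, Unwrap and FreeHandle are hypothetical stand-ins so the example compiles without XGBoost; they are not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for xgboost::DMatrix (not the real class).
struct FakeDMatrix {
    std::vector<float> values;
};

// Opaque handle, analogous to DMatrixHandle in a C API.
using FakeHandle = void*;

// Stand-in for a C-API "create" call: heap-allocate a shared_ptr
// and hand its address out as an opaque pointer.
FakeHandle CreateHandle(const float* data, std::size_t n) {
    return new std::shared_ptr<FakeDMatrix>(
        std::make_shared<FakeDMatrix>(
            FakeDMatrix{std::vector<float>(data, data + n)}));
}

// Recover the raw object pointer from the handle, using the same
// static_cast<std::shared_ptr<T>*>(handle)->get() shape as the answer.
FakeDMatrix* Unwrap(FakeHandle handle) {
    assert(handle != nullptr);
    return static_cast<std::shared_ptr<FakeDMatrix>*>(handle)->get();
}

// Stand-in for a C-API "free" call: destroys the shared_ptr itself,
// which releases the object if this was the last owner.
void FreeHandle(FakeHandle handle) {
    delete static_cast<std::shared_ptr<FakeDMatrix>*>(handle);
}
```

The pointer returned by Unwrap does not own the object, so it is only valid until the handle is freed.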
For problem (2), I don't have an answer. I actually have the same problem: I trained an XGBoost regressor in Python, and I get different results from the same features in C++.
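One thing that may be worth checking in the question's code: as far as I know, the third argument of XGBoosterPredict is an option mask where 1 requests the raw, untransformed margin, and the second argument of Learner::Predict is an output_margin flag, while Python's predict() returns transformed values by default. If the model uses a logistic objective (an assumption, since the question doesn't show plst), the two outputs are related by a sigmoid. The helper names below are hypothetical; this is only a sketch for comparing the two sets of scores.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Logistic transform: maps a raw margin (log-odds) to a probability.
// Assumes a logistic objective such as binary:logistic.
inline float MarginToProbability(float margin) {
    return 1.0f / (1.0f + std::exp(-margin));
}

// Apply the transform to a buffer of raw margins, e.g. the array
// returned by a prediction call that was asked for margins.
std::vector<float> MarginsToProbabilities(const float* margins, std::size_t n) {
    assert(margins != nullptr || n == 0);
    std::vector<float> probs;
    probs.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        probs.push_back(MarginToProbability(margins[i]));
    }
    return probs;
}
```

If the transformed C++ scores match Python's, passing 0 as the option mask (or false as output_margin) should give matching results directly.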