From the XGBoost guide:
After training, the model can be saved.
bst.save_model('0001.model')
The model and its feature map can also be dumped to a text file.
# dump model
bst.dump_model('dump.raw.txt')
# dump model with feature map
bst.dump_model('dump.raw.txt', 'featmap.txt')
A saved model can be loaded as follows:
bst = xgb.Booster({'nthread': 4})  # init model
bst.load_model('model.bin')  # load data
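My questions are the following:
What is the difference between save_model and dump_model?
What is the difference between saving '0001.model' and 'dump.raw.txt', 'featmap.txt'?
Why is the model name used for loading, model.bin, different from the name to be saved, 0001.model?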
Suppose I trained two models, model_A and model_B. I wanted to save both models for future use. Which save and load function should I use? Could you help show the clear process?
Here is how I solved the problem:
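import pickle

file_name = "xgb_reg.pkl"

# save the trained model (xgb_model); the with block closes the file afterwards
with open(file_name, "wb") as f:
    pickle.dump(xgb_model, f)

# load it back from the same file
with open(file_name, "rb") as f:
    xgb_model_loaded = pickle.load(f)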
# test
ind = 1
test = X_val[ind]
xgb_model_loaded.predict(test)[0] == xgb_model.predict(test)[0]
Out[1]: True
Both save_model and dump_model save the model. The difference is that dump_model can also save the feature names and writes the trees in text format.
load_model works with a model written by save_model. The output of dump_model can be used, for example, with xgbfi.
When loading the model, you need to specify the path where your model is saved. In the example bst.load_model("model.bin"), the model is loaded from the file model.bin; that is just the name of the file containing the model. Good luck!
EDIT: From the XGBoost documentation (for version 1.3.3), dump_model() should be used for saving the model for further interpretation. For saving and loading the model, save_model() and load_model() should be used. Please check the docs for more details.
There is also a difference between the Learning API and the Scikit-Learn API of XGBoost. The latter saves the best_ntree_limit variable, which is set during training with early stopping. You can read the details in my article "How to save and load Xgboost in Python?"
The save_model() method recognizes the format from the file name: if *.json is specified, the model is saved as JSON. Otherwise, in the 1.x releases discussed here, it is saved in XGBoost's binary format rather than as a text file.