
How to save & load xgboost model? [closed]

From the XGBoost guide:

After training, the model can be saved.

bst.save_model('0001.model')

The model and its feature map can also be dumped to a text file.

# dump model
bst.dump_model('dump.raw.txt')
# dump model with feature map
bst.dump_model('dump.raw.txt', 'featmap.txt')

A saved model can be loaded as follows:

bst = xgb.Booster({'nthread': 4})  # init model
bst.load_model('model.bin')  # load data

My questions are as follows:

  1. What's the difference between save_model and dump_model?
  2. What's the difference between saving '0001.model' and saving 'dump.raw.txt' with 'featmap.txt'?
  3. Why is the file name used when loading (model.bin) different from the name used when saving (0001.model)?
  4. Suppose that I trained two models, model_A and model_B, and want to save both for future use. Which save and load functions should I use? Could you show the process clearly?
Pengju Zhao asked Apr 29 '17 03:04



2 Answers

Here is how I solved the problem:

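import pickle

file_name = "xgb_reg.pkl"

# save the trained model (xgb_model) to disk
with open(file_name, "wb") as f:
    pickle.dump(xgb_model, f)

# load it back
with open(file_name, "rb") as f:
    xgb_model_loaded = pickle.load(f)

# test: both models should give the same prediction for one validation row
ind = 1
test = X_val[ind:ind + 1]  # slice so the input stays 2-D, as predict() expects
xgb_model_loaded.predict(test)[0] == xgb_model.predict(test)[0]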

Out[1]: True
ChrisDanger answered Oct 17 '22 07:10


Both save_model and dump_model write the model to disk. The difference is that dump_model writes the trees in a human-readable text format and can include feature names.

load_model works with a model written by save_model. The output of dump_model is not meant to be loaded back; it can be used, for example, with xgbfi.

When loading a model, you need to specify the path to the file where the model was saved. In the example, bst.load_model("model.bin") loads the model from the file model.bin. That is simply the name of the file containing the model, so it has to match the name you passed to save_model. Good luck!

EDIT: According to the XGBoost documentation (for version 1.3.3), dump_model() should be used to save the model for further interpretation. For saving and loading the model, save_model() and load_model() should be used. Please check the docs for more details.

There is also a difference between the Learning API and the Scikit-Learn API of XGBoost. The latter saves the best_ntree_limit variable, which is set during training with early stopping. You can read the details in my article "How to save and load Xgboost in Python?"

The save_model() method picks the format from the file extension: if *.json is specified, the model is saved as JSON; otherwise, in the version referenced above, it is saved in XGBoost's internal binary format.

pplonski answered Oct 17 '22 06:10