
Save Python random forest model to file

In R, after fitting a random forest model, I can use save.image("***.RData") to store the model. Afterwards, I can simply load the model and make predictions directly.

Can you do something similar in Python? I have separated the model and the prediction into two files. In the model file:

    rf = RandomForestRegressor(n_estimators=250, max_features=9, compute_importances=True)
    fit = rf.fit(Predx, Predy)

I tried to return rf or fit, but I still can't load the model in the prediction file.

Is it possible to separate the model and the prediction using the sklearn random forest package?

user3013706 asked Dec 18 '13 15:12



2 Answers

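    ...
    import pickle  # Python 3; the original answer used Python 2's cPickle
    from sklearn.ensemble import RandomForestRegressor

    rf = RandomForestRegressor()
    rf.fit(X, y)

    # serialize the fitted model to disk
    with open('path/to/file', 'wb') as f:
        pickle.dump(rf, f)

    # in your prediction file
    with open('path/to/file', 'rb') as f:
        rf = pickle.load(f)

    preds = rf.predict(new_X)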
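To make the split between a model step and a prediction step concrete, here is a minimal self-contained sketch. The function names train_and_save and load_and_predict, the synthetic data, and the temporary file path are all assumptions made for illustration; they are not part of the original answer.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.ensemble import RandomForestRegressor


def train_and_save(path):
    """Model step: fit a forest on synthetic data and pickle it to `path`."""
    rng = np.random.RandomState(0)
    X = rng.rand(100, 4)          # hypothetical training features
    y = X[:, 0] + 2.0 * X[:, 1]   # hypothetical target
    rf = RandomForestRegressor(n_estimators=20, random_state=0)
    rf.fit(X, y)
    with open(path, "wb") as f:
        pickle.dump(rf, f)
    return rf


def load_and_predict(path, new_X):
    """Prediction step: unpickle the fitted forest and predict."""
    with open(path, "rb") as f:
        rf = pickle.load(f)
    return rf.predict(new_X)


# Hypothetical location for the saved model
model_path = os.path.join(tempfile.mkdtemp(), "rf.pkl")
original = train_and_save(model_path)

new_X = np.random.RandomState(1).rand(5, 4)
preds = load_and_predict(model_path, new_X)

# Check whether the reloaded model reproduces the in-memory model's predictions
print(np.allclose(preds, original.predict(new_X)))
```

In a real project the two functions would live in separate files and share only the path. Pickled models should be loaded with the same scikit-learn version that wrote them, and only from sources you trust.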
Jake Burkhead answered Oct 08 '22 04:10


You can use joblib to save and load a random forest from scikit-learn (in fact, any scikit-learn model).

An example:

    import joblib
    from sklearn.ensemble import RandomForestClassifier

    # create RF
    rf = RandomForestClassifier()
    # fit on some data
    rf.fit(X, y)

    # save
    joblib.dump(rf, "my_random_forest.joblib")

    # load
    loaded_rf = joblib.load("my_random_forest.joblib")

In addition, joblib.dump has a compress argument, so the saved model can be compressed. In a very simple test on the iris dataset, compress=3 reduced the file size by a factor of about 5.6.
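A rough way to repeat that size comparison is sketched below. The choice of n_estimators=100, the temporary directory, and the file names are assumptions made for illustration, so the ratio you get will likely differ from the figure quoted above.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Fit a small forest on the iris dataset
X, y = load_iris(return_X_y=True)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Hypothetical output locations
out_dir = tempfile.mkdtemp()
plain_path = os.path.join(out_dir, "rf_plain.joblib")
packed_path = os.path.join(out_dir, "rf_compressed.joblib")

# Save once without compression and once with compress=3
joblib.dump(rf, plain_path)
joblib.dump(rf, packed_path, compress=3)

plain_size = os.path.getsize(plain_path)
packed_size = os.path.getsize(packed_path)
print(f"uncompressed: {plain_size} bytes, compress=3: {packed_size} bytes")
print(f"ratio: {plain_size / packed_size:.1f}x")

# The compressed file loads the same way as the uncompressed one
loaded = joblib.load(packed_path)
same = bool((loaded.predict(X) == rf.predict(X)).all())
print(same)
```

Higher compress levels trade slower saving and loading for smaller files, so it is worth measuring on your own model before settling on a value.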

pplonski answered Oct 08 '22 06:10