 

Python scikit-learn: exporting trained classifier

I am using a DBN (deep belief network) from nolearn based on scikit-learn.

I have already built a network that classifies my data very well. Now I want to export the model for deployment, but I don't know how (currently I retrain the DBN every time I want to predict something). In MATLAB I would just export the weight matrix and import it on another machine.

Does anyone know how to export the model or its weight matrix so that it can be imported elsewhere without training the whole model again?

jcdmb asked Jul 07 '13 12:07





1 Answer

First, install joblib.

You can then save the trained classifier (clf below) to disk:

>>> import joblib
>>> joblib.dump(clf, 'my_model.pkl', compress=9)

And then later, on the prediction server:

>>> import joblib
>>> model_clone = joblib.load('my_model.pkl')

This is essentially a Python pickle with optimized handling of large NumPy arrays. It has the same limitations as regular pickle with respect to code changes: if the class structure of the pickled object changes, you might no longer be able to unpickle it with newer versions of nolearn or scikit-learn.
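One way to make that limitation easier to live with (not something joblib does for you, just a sketch) is to record the library versions used at training time in a small file next to the pickle, and compare them on the prediction server before loading. The helper names `build_manifest` and `find_mismatches` and the package list are assumptions made up for this illustration; only the standard library is used.

```python
import json
import platform
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def build_manifest(packages=("scikit-learn", "nolearn", "joblib", "numpy")):
    """Describe the current environment (hypothetical manifest layout)."""
    return {
        "python": platform.python_version(),
        "packages": {name: installed_version(name) for name in packages},
    }


def find_mismatches(saved, current):
    """List (name, saved_version, current_version) for every difference."""
    mismatches = []
    if saved["python"] != current["python"]:
        mismatches.append(("python", saved["python"], current["python"]))
    for name, version in saved["packages"].items():
        now = current["packages"].get(name)
        if version != now:
            mismatches.append((name, version, now))
    return mismatches


# On the training machine: build the manifest and serialize it.
# In practice you would write this text to a file stored beside my_model.pkl.
manifest = build_manifest()
manifest_json = json.dumps(manifest, indent=2, sort_keys=True)

# On the prediction server: parse the saved manifest and compare.
problems = find_mismatches(json.loads(manifest_json), build_manifest())
```

An empty `problems` list only says the versions match; it does not by itself guarantee that the pickle will load.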

If you want a long-term, robust way of storing your model parameters, you might need to write your own IO layer, e.g. using binary serialization tools such as Protocol Buffers or Avro, or a less efficient yet portable text representation such as JSON or XML (PMML, for instance, is XML-based).
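As a rough sketch of what such a hand-written IO layer could look like with JSON: the layout below (a `format_version` field plus a list of per-layer weights and biases) and the function names are assumptions for illustration, not a format that nolearn or scikit-learn defines. How you pull the weight matrices out of your trained DBN depends on the library version, so the example uses toy nested lists; with NumPy arrays you would call `.tolist()` before exporting.

```python
import json
import os
import tempfile

FORMAT_VERSION = 1  # bump this whenever the file layout changes


def export_weights(layers, path):
    """Write a list of (weights, biases) pairs to a JSON file."""
    payload = {
        "format_version": FORMAT_VERSION,
        "layers": [{"weights": w, "biases": b} for w, b in layers],
    }
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)


def import_weights(path):
    """Read the (weights, biases) pairs back, checking the format version."""
    with open(path, "r", encoding="utf-8") as fh:
        payload = json.load(fh)
    if payload.get("format_version") != FORMAT_VERSION:
        raise ValueError("unsupported weight file format")
    return [(layer["weights"], layer["biases"]) for layer in payload["layers"]]


# Toy two-layer network: a 2x2 weight matrix, then a 2x1 weight matrix.
layers = [
    ([[0.5, -1.0], [0.25, 2.0]], [0.0, 0.1]),
    ([[1.0], [-0.5]], [0.2]),
]

path = os.path.join(tempfile.mkdtemp(), "weights.json")
export_weights(layers, path)
restored = import_weights(path)
```

The file no longer depends on any Python class definition, so it survives library upgrades, but you have to rebuild the network around the restored weights yourself.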

ogrisel answered Sep 21 '22 22:09
