 

How to save GridSearchCV object?

Lately, I have been applying grid search cross-validation (sklearn GridSearchCV) for hyper-parameter tuning in Keras with the TensorFlow backend. Once my model is tuned, I try to save the GridSearchCV object for later use, without success.

The hyper-parameter tuning is done as follows:

x_train, x_val, y_train, y_val = train_test_split(NN_input, NN_target, train_size = 0.85, random_state = 4)

history = History() 
kfold = 10


regressor = KerasRegressor(build_fn = create_keras_model, epochs = 100, batch_size=1000, verbose=1)

neurons = np.arange(10,101,10) 
hidden_layers = [1,2]
optimizer = ['adam','sgd']
activation = ['relu'] 
dropout = [0.1] 

parameters = dict(neurons = neurons,
                  hidden_layers = hidden_layers,
                  optimizer = optimizer,
                  activation = activation,
                  dropout = dropout)

gs = GridSearchCV(estimator = regressor,
                  param_grid = parameters,
                  scoring='neg_mean_squared_error',  # 'mean_squared_error' was removed in scikit-learn 0.20
                  n_jobs = 1,
                  cv = kfold,
                  verbose = 3,
                  return_train_score=True)

grid_result = gs.fit(NN_input,
                    NN_target,
                    callbacks=[history],
                    verbose=1,
                    validation_data=(x_val, y_val))

Remark: the create_keras_model function initializes and compiles a Keras Sequential model.

After the cross validation is performed I am trying to save the grid search object (gs) with the following code:

from sklearn.externals import joblib

joblib.dump(gs, 'GS_obj.pkl')

The error I am getting is the following:

TypeError: can't pickle _thread.RLock objects

Could you please let me know what might be the reason for this error?

Thank you!

P.S.: joblib.dump works well for saving GridSearchCV objects that are used for training sklearn MLPRegressor models.

E.Thrampoulidis asked Jul 19 '18 13:07


2 Answers

Import joblib directly:

import joblib

instead of:

from sklearn.externals import joblib

(sklearn.externals.joblib was deprecated in scikit-learn 0.21 and removed in 0.23.)

Save objects or results with:

joblib.dump(gs, 'model_file_name.pkl')

and load your results using:

joblib.load("model_file_name.pkl")

Here is a simple working example:


import joblib

# Save your model or results (gs is the fitted GridSearchCV object)
joblib.dump(gs, 'model_file_name.pkl')

# Load your model for further usage
gs = joblib.load("model_file_name.pkl")

liedji answered Sep 27 '22 20:09


Try this:

import joblib
joblib.dump(gs.best_estimator_, 'filename.pkl')

If you want to dump your object into a single compressed file, use:

joblib.dump(gs.best_estimator_, 'filename.pkl', compress = 1)

Simple Example:

from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV
import joblib

iris = datasets.load_iris()
parameters = {'kernel':('linear', 'rbf'), 'C':[1, 10]}
svc = svm.SVC()
gs = GridSearchCV(svc, parameters)
gs.fit(iris.data, iris.target)

joblib.dump(gs.best_estimator_, 'filename.pkl')  # returns ['filename.pkl']

EDIT 1:

You can also save the whole object:

joblib.dump(gs, 'gs_object.pkl')
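
As for why the original error appears: pickle (which joblib uses underneath) cannot serialize thread locks, and an object that holds one anywhere in its attributes fails as a whole. The sketch below reproduces that with the standard library only. FakeSearch is a hypothetical stand-in, not a real GridSearchCV, and its parameter and score values are made-up placeholders. It also shows one possible workaround, assuming the tuned parameters and scores are all you need later: copy the plain-data attributes into a dict and pickle that instead.

```python
import pickle
import threading


class FakeSearch:
    """Hypothetical stand-in for a fitted search object wrapping a Keras model."""

    def __init__(self):
        # Placeholder values for illustration only, not real results.
        self.best_params_ = {"neurons": 50, "optimizer": "adam"}
        self.best_score_ = -0.12
        # Stands in for a lock held somewhere inside the wrapped model/backend.
        self._lock = threading.RLock()


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False if it raises TypeError."""
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False


search = FakeSearch()

# The whole object fails because one attribute is an RLock.
whole_ok = can_pickle(search)

# Plain-data attributes pickle fine once separated from the lock.
summary = {"best_params": search.best_params_, "best_score": search.best_score_}
restored = pickle.loads(pickle.dumps(summary))

print("whole object picklable:", whole_ok)
print("restored summary:", restored)
```

With a real Keras-backed search, the same idea would mean saving gs.best_params_ and gs.cv_results_ this way, and saving the network itself separately with Keras's own model.save rather than through pickle.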

seralouk answered Sep 27 '22 20:09