How to save a scikit-learn pipeline with a Keras regressor inside to disk?

Tags: python, machine-learning, keras, scikit-learn, joblib

I have a scikit-learn pipeline with a KerasRegressor in it:

estimators = [
    ('standardize', StandardScaler()),
    ('mlp', KerasRegressor(build_fn=baseline_model, nb_epoch=5, batch_size=1000, verbose=1))
    ]
pipeline = Pipeline(estimators)

After training the pipeline, I am trying to save it to disk using joblib:

joblib.dump(pipeline, filename , compress=9)

But I am getting an error:

RuntimeError: maximum recursion depth exceeded

How would you save the pipeline to disk?

894

asked Jun 23 '16 06:06

Dror Hilman

2 Answers

I struggled with the same problem, as there is no direct way to do this. Here is a hack that worked for me: I saved the pipeline in two files. The first file stores the pickled scikit-learn pipeline, and the second stores the Keras model:

...
from keras.models import load_model
import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23

...

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('estimator', KerasRegressor(build_model))
])

pipeline.fit(X_train, y_train)

# Save the Keras model first:
pipeline.named_steps['estimator'].model.save('keras_model.h5')

# Detach the Keras model so that the rest of the pipeline can be pickled:
pipeline.named_steps['estimator'].model = None

# Finally, save the pipeline:
joblib.dump(pipeline, 'sklearn_pipeline.pkl')

del pipeline

And here is how the model could be loaded back:

# Load the pipeline first:
pipeline = joblib.load('sklearn_pipeline.pkl')

# Then, load the Keras model:
pipeline.named_steps['estimator'].model = load_model('keras_model.h5')

y_pred = pipeline.predict(X_test)
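
The save and load steps above can be wrapped in two small helpers. What follows is only a sketch under stated assumptions, not a definitive implementation: the names `detached_model`, `save_pipeline`, `load_pipeline` and the `model_loader` parameter are hypothetical, and the standard-library `pickle` stands in for `joblib` so the snippet has no third-party dependencies. The one difference from the snippet above is that the Keras model is put back on the in-memory pipeline after saving, so the pipeline stays usable instead of being left with `model = None`.

```python
import pickle
from contextlib import contextmanager


@contextmanager
def detached_model(pipeline, step_name):
    """Temporarily set the wrapper's `model` attribute to None, then restore it."""
    estimator = pipeline.named_steps[step_name]
    model = estimator.model
    estimator.model = None
    try:
        yield model
    finally:
        # Put the model back even if saving fails.
        estimator.model = model


def save_pipeline(pipeline, step_name, pipeline_path, model_path):
    """Save the model with its own `save` method and pickle the rest of the pipeline."""
    with detached_model(pipeline, step_name) as model:
        model.save(model_path)
        # Assumption: pickle is used here; joblib.dump(pipeline, pipeline_path)
        # would take the place of these two lines.
        with open(pipeline_path, "wb") as fh:
            pickle.dump(pipeline, fh)


def load_pipeline(pipeline_path, model_path, step_name, model_loader):
    """Unpickle the pipeline and re-attach the model returned by `model_loader`."""
    with open(pipeline_path, "rb") as fh:
        pipeline = pickle.load(fh)
    pipeline.named_steps[step_name].model = model_loader(model_path)
    return pipeline
```

With the pipeline from this answer, the calls would presumably look like `save_pipeline(pipeline, 'estimator', 'sklearn_pipeline.pkl', 'keras_model.h5')` and `load_pipeline('sklearn_pipeline.pkl', 'keras_model.h5', 'estimator', load_model)`, where `load_model` is the function imported from `keras.models`.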

137

answered Oct 02 '22 22:10

constt

Keras is not compatible with pickle out of the box. You can fix it if you are willing to monkey patch: https://github.com/tensorflow/tensorflow/pull/39609#issuecomment-683370566.

You can also use the SciKeras library, which does this for you and is a drop-in replacement for the Keras scikit-learn wrappers (KerasClassifier and KerasRegressor): https://github.com/adriangb/scikeras

Disclosure: I am the author of SciKeras as well as that PR.

answered Oct 02 '22 21:10

LoveToCode
