Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to pickle or store Jupyter (IPython) notebook session for later

People also ask

How do I save an Ipython notebook?

Saving your edits is simple. There is a disk icon in the upper left of the Jupyter tool bar. Click the save icon and your notebook edits are saved.

Do Jupyter notebooks automatically save?

Saving a NotebookJupyter Notebooks autosave, so you don't have to worry about losing code too much. At the top of the page you can usually see the current save status: Last Checkpoint: 2 minutes ago (unsaved changes)

How do you keep a Jupyter Notebook alive?

This can be done by typing jupyter notebook in the terminal, which will open a browser. Then, navigate to the respective jupyter notebook file in the browser and open it. Click Cell > Run All on the toolbar. All done!


I think Dill answers your question well.

pip install dill

Save a Notebook session:

import dill
dill.dump_session('notebook_env.db')

Restore a Notebook session:

import dill
dill.load_session('notebook_env.db')

Source


(I'd rather comment than offer this as an actual answer, but I need more reputation to comment.)

You can store most data-like variables in a systematic way. What I usually do is store all dataframes, arrays, etc. in pandas.HDFStore. At the beginning of the notebook, declare

backup = pd.HDFStore('backup.h5')

and then store any new variables as you produce them

backup['var1'] = var1

At the end, probably a good idea to do

backup.close()

before turning off the server. The next time you want to continue with the notebook:

backup = pd.HDFStore('backup.h5')
var1 = backup['var1']

Truth be told, I'd prefer built-in functionality in ipython notebook, too. You can't save everything this way (e.g. objects, connections), and it's hard to keep the notebook organized with so much boilerplate codes.


This question is related to: How to cache in IPython Notebook?

To save the results of individual cells, the caching magic comes in handy.

%%cache longcalc.pkl var1 var2 var3
var1 = longcalculation()
....

When rerunning the notebook, the contents of this cell is loaded from the cache.

This is not exactly answering your question, but it might be enough to when the results of all the lengthy calculations are recovered fast. This in combination of hitting the run-all button on top of the notebook is for me a workable solution.

The cache magic cannot save the state of a whole notebook yet. To my knowledge there is no other system yet to resume a "notebook". This would require to save all the history of the python kernel. After loading the notebook, and connecting to a kernel, this information should be loaded.