
Run parts of an IPython notebook in a loop / with different input parameters

I have written an IPython notebook that analyses a dataset. Now I want to use this code to loop over different datasets.

The code is split into about 50 cells (including comments, markdown explanations, ...). Is there a way to run parts of a notebook in a loop, or to run a whole notebook with different input parameters?

I don't want to merge all cells into one function or download the code as a Python script, as I really like running (and experimenting with) parts of the analysis by executing only certain cells.

Basically, it's like refactoring parts of a script into a function and calling the function in a loop, except that the "parts of the script" are notebook cells.

Jan Katins asked Mar 26 '13 10:03



2 Answers

What I usually do in these scenarios is wrap the important cells as functions (you don't have to merge any of them) and have a certain master cell that iterates over a list of parameters and calls these functions. E.g. this is what a "master cell" looks like in one of my notebooks:

    import itertools

    # parameters
    P_peak_all = [100, 200]
    idle_ratio_all = [0., 0.3, 0.6]

    # iterate through these parameters and call the notebook's logic
    for P_peak, idle_ratio in itertools.product(P_peak_all, idle_ratio_all):
        print(P_peak, idle_ratio, P_peak*idle_ratio)
        print('========================')
        m_synth, m_synth_ns = build_synth_measurement(P_peak, idle_ratio)
        compare_measurements(m_synth, m_synth_ns, "Peak pauser", "No scheduler",
                             file_note="-%d-%d" % (P_peak, int(idle_ratio*100)))

You can still carry some data through the notebook (i.e. call each function at the bottom of its cell with your data) so that you can test things live in individual cells. For example, a cell might contain:

    def square(x):
        y = x**2
        return y

    square(x)  # where x is your data running from the prior cells

This lets you experiment live and still call the generic functionality from the master cell.

I know it's some additional work to refactor your notebook using functions, but I found it actually increases my notebook's readability, which is useful when you come back to it after a longer period, and it makes the notebook easier to convert to a "proper" script or module if necessary.
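Applied to the question's case of several datasets, the same pattern might look like the sketch below. The function name `summarize`, the in-memory `datasets` dictionary, and the sample values are all made up for illustration; in a real notebook the data would come from whatever loading cells already exist.

```python
# Cell 1: one analysis step wrapped as a function (hypothetical name)
def summarize(values):
    """Return basic statistics for one dataset."""
    n = len(values)
    mean = sum(values) / n
    return {"n": n, "mean": mean, "max": max(values)}


# Still cell 1: call the function on the "current" data so the cell
# can be run and inspected on its own while experimenting
values = [1.0, 2.0, 3.0]  # stand-in for data loaded in an earlier cell
summarize(values)

# Master cell: loop over all datasets and reuse the same function
datasets = {"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0]}  # illustrative data
results = {name: summarize(data) for name, data in datasets.items()}

for name, stats in results.items():
    print(name, stats)
```

Each cell stays independently runnable, and the master cell is the only place that knows about the full list of datasets.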

metakermit answered Sep 28 '22 00:09


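A cheap but quick trick is to use "run all cells" in a kind of while loop (see the related question: IPython/Jupyter - can we program a "run all cell above"?).

Lay the notebook out like this:

1. a first cell that changes the parameters
2. your code
3. a last cell that triggers "run all cells" again:

    from IPython.display import Javascript, display

    display(Javascript('IPython.notebook.execute_all_cells()'))

Unless the last cell only does this under some condition, the notebook keeps re-running until you interrupt the kernel.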
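One way to make this loop stop on its own is to keep a queue of parameters in the kernel's namespace and only trigger the re-run while the queue is non-empty. The sketch below is an assumption about how that could be arranged, not part of the original answer: the helper names `next_parameter` and `rerun_all_cells` and the key `_pending_params` are invented, and the JavaScript call relies on the `IPython.notebook` object, which as far as I know exists in the classic Notebook front end but not in JupyterLab.

```python
def next_parameter(namespace, all_values, key="_pending_params"):
    """Pop the next parameter; the queue is created on the first pass only."""
    queue = namespace.setdefault(key, list(all_values))
    return queue.pop(0) if queue else None


def rerun_all_cells():
    """Ask the classic Notebook front end to execute every cell again."""
    # Imported here so the queue logic can be tried without IPython installed
    from IPython.display import Javascript, display
    display(Javascript("IPython.notebook.execute_all_cells()"))


# Simulation of the notebook passes with a plain dict.
# In a notebook, `namespace` would be globals(), which survives "run all".
namespace = {}
seen = []
while True:
    param = next_parameter(namespace, [100, 200, 300])  # first cell
    seen.append(param)                                   # "your code"
    if not namespace["_pending_params"]:                 # last cell
        break
    # in a real notebook the last cell would call rerun_all_cells() here

print(seen)
```

After the queue is exhausted, delete the `_pending_params` entry (or restart the kernel) before starting a new sweep, since `next_parameter` returns `None` once nothing is left.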

Grek answered Sep 28 '22 02:09