Sometimes a single cell takes a long time to run. While it is running, I would like to write and run other cells in the same notebook, accessing the variables in the same context.
Is there an IPython magic that, when added to a cell, makes the cell run in a new thread that shares the notebook's global data?
This may not be an answer, but rather a direction toward one. I have not seen anything like that, but I am interested in this too.
My current findings suggest that one needs to define one's own custom cell magic. Good references would be the custom cell magic section in the documentation and two examples that I would consider:
Both of those links wrap the code in a thread. That could be a starting point.
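As a rough illustration of that idea, here is a minimal sketch of a cell magic that runs the cell body in a thread against the notebook's global namespace. The names run_cell_in_thread and background are hypothetical, not taken from the linked examples, and the sketch assumes the cell contains plain Python only (no magics or shell escapes inside it).

```python
import threading


def run_cell_in_thread(source, namespace):
    """Execute `source` in a daemon thread, sharing `namespace` as globals."""
    thread = threading.Thread(target=exec, args=(source, namespace), daemon=True)
    thread.start()
    return thread


# Register the magic only when running inside IPython; outside of it the
# helper above still works as an ordinary function.
try:
    from IPython import get_ipython
    from IPython.core.magic import register_cell_magic
except ImportError:
    get_ipython = None

if get_ipython is not None and get_ipython() is not None:

    @register_cell_magic
    def background(line, cell):
        # Hypothetical magic name; `cell` is the cell body as a string.
        ip = get_ipython()
        return run_cell_in_thread(cell, ip.user_global_ns)
```

With this in place, a cell starting with %%background would return immediately, and any names it assigns would appear in the notebook namespace once the thread has executed them. Nothing here synchronizes access to shared variables, so other cells can observe half-finished state.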
UPDATE: The ngcm-tutorial repository on GitHub has a description of the background jobs class:

    # github.com/jupyter/ngcm-tutorial/blob/master/Day-1/IPython%20Kernel/Background%20Jobs.ipynb
    import sys
    import time

    from IPython.lib import backgroundjobs as bg

    jobs = bg.BackgroundJobManager()

    def printfunc(interval=1, reps=5):
        for n in range(reps):
            time.sleep(interval)
            print('In the background... %i' % n)
            sys.stdout.flush()
        print('All done!')
        sys.stdout.flush()

    jobs.new('printfunc(1,3)')
    jobs.status()

UPDATE 2: Another option:

    from IPython.display import display
    from ipywidgets import IntProgress

    import threading

    class App(object):
        def __init__(self, nloops=2000):
            self.nloops = nloops
            self.pb = IntProgress(description='Thread loops', min=0, max=self.nloops)

        def start(self):
            display(self.pb)
            while self.pb.value < self.nloops:
                self.pb.value += 1
            self.pb.color = 'red'

    app = App(nloops=20000)

    t = threading.Thread(target=app.start)
    t.start()
    #t.join()

Here is a little snippet that I came up with:

    def jobs_manager():
        from IPython.lib.backgroundjobs import BackgroundJobManager
        from IPython.core.magic import register_line_magic
        from IPython import get_ipython

        jobs = BackgroundJobManager()

        @register_line_magic
        def job(line):
            ip = get_ipython()
            jobs.new(line, ip.user_global_ns)

        return jobs

It uses IPython's built-in module IPython.lib.backgroundjobs, so the code is small and simple and no new dependencies are introduced.
I use it like this:

    jobs = jobs_manager()
    %job [fetch_url(_) for _ in urls]  # saves html file to disk
    Starting job # 0 in a separate thread.

Then you can monitor the state with:

    jobs.status()
    Running jobs:
    1 : [fetch_url(_) for _ in urls]

    Dead jobs:
    0 : [fetch_url(_) for _ in urls]

If a job fails, you can inspect its stack trace with:

    jobs.traceback(0)

There is no way to kill a job, so I carefully use this dirty hack:

    def kill_thread(thread):
        import ctypes

        id = thread.ident
        code = ctypes.pythonapi.PyThreadState_SetAsyncExc(
            ctypes.c_long(id),
            ctypes.py_object(SystemError)
        )
        if code == 0:
            raise ValueError('invalid thread id')
        elif code != 1:
            ctypes.pythonapi.PyThreadState_SetAsyncExc(
                ctypes.c_long(id),
                ctypes.c_long(0)
            )
            raise SystemError('PyThreadState_SetAsyncExc failed')

It raises SystemError in a given thread. So to kill a job I do:

    kill_thread(jobs.all[1])

To kill all running jobs I do:

    for thread in jobs.running:
        kill_thread(thread)

I like to use %job with the widget-based progress bar https://github.com/alexanderkuk/log-progress like this:

    %job [fetch_url(_) for _ in log_progress(urls, every=1)]

http://g.recordit.co/iZJsJm8BOL.gif
One can even use %job instead of multiprocessing.pool.ThreadPool:

    for chunk in get_chunks(urls, 3):
        %job [fetch_url(_) for _ in log_progress(chunk, every=1)]

http://g.recordit.co/oTVCwugZYk.gif
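The get_chunks helper is not defined in the snippets above. One plausible version, assuming its second argument is the number of chunks (and therefore the number of jobs started), might look like this:

```python
def get_chunks(items, n):
    """Split `items` into at most `n` contiguous chunks of near-equal size.

    Assumed behavior only: the real helper is not shown in the answer.
    """
    items = list(items)
    size, extra = divmod(len(items), n)
    start = 0
    for i in range(n):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        if start < end:  # skip empty chunks when there are fewer items than n
            yield items[start:end]
        start = end
```

Each chunk then becomes one %job, so the number of chunks bounds how many threads run at once.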
Some obvious problems with this code:

1. You cannot use arbitrary code in %job. There can be no assignments and no prints, for example, so I use it with routines that store their results on the hard drive.
2. Sometimes the dirty hack in kill_thread does not work. I think that is why IPython.lib.backgroundjobs does not have this functionality by design. If the thread is inside a system call such as sleep or read, the exception is not raised while the thread is blocked there.
3. It uses threads. Python has the GIL, so %job cannot be used for heavy computations that run in Python bytecode.
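Since kill_thread is unreliable, one alternative (a different technique from the hack above, shown only as a sketch) is cooperative cancellation: the job checks a threading.Event between items and exits on its own. The names make_stoppable_worker and slow_handle are hypothetical stand-ins; slow_handle plays the role of something like fetch_url.

```python
import threading
import time


def make_stoppable_worker(items, handle, stop_event):
    """Return a function that handles items until `stop_event` is set."""
    def worker():
        for item in items:
            if stop_event.is_set():
                break  # leave cleanly instead of being killed from outside
            handle(item)
    return worker


stop = threading.Event()
done = []


def slow_handle(item):
    # Stand-in for real work such as downloading one URL.
    time.sleep(0.01)
    done.append(item)


thread = threading.Thread(
    target=make_stoppable_worker(range(1000), slow_handle, stop),
    daemon=True,
)
thread.start()

time.sleep(0.05)        # let a few items go through
stop.set()              # ask the worker to stop
thread.join(timeout=2)  # it exits after finishing the current item
```

The trade-off is that the worker only notices the request between items, so a single long blocking call still has to finish first.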