So I am using joblib to parallelize some code, and I noticed that I couldn't print anything from it when running inside a Jupyter notebook.
I tried the same example in IPython and it worked perfectly.
Here is a minimal (not) working example to run in a Jupyter notebook cell:
from joblib import Parallel, delayed
Parallel(n_jobs=8)(delayed(print)(i) for i in range(10))
I get the output [None, None, None, None, None, None, None, None, None, None], but nothing is printed.
What I expect to see (the print order could be random in practice):
0
1
2
3
4
5
6
7
8
9
[None, None, None, None, None, None, None, None, None, None]
Note: you can see the prints in the logs of the notebook server process, but I would like the prints to happen in the notebook itself, not in those logs.
I have opened a GitHub issue, but it has received minimal attention so far.
Yes, joblib should work in interactive Jupyter sessions (for interactively defined Python functions with picklable arguments).
The delayed function is a simple trick to create a tuple (function, args, kwargs) with a function-call syntax. Under Windows, using multiprocessing.Pool requires protecting the main loop of code to avoid recursive spawning of subprocesses when using joblib.
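That (function, args, kwargs) tuple can be sketched with a toy reimplementation (illustrative only, not joblib's actual code):

```python
# Toy sketch of what delayed() does: capture a call as a
# (function, args, kwargs) tuple instead of executing it.
# (Illustrative only, not joblib's actual implementation.)
def delayed_sketch(function):
    def capture(*args, **kwargs):
        return (function, args, kwargs)
    return capture

# Nothing is printed here; the call is merely recorded
task = delayed_sketch(print)(3, end="")

# The tuple can be replayed later, e.g. inside a worker process
func, args, kwargs = task
```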
Joblib is a set of tools providing lightweight pipelining in Python; in particular, transparent disk-caching of functions, lazy re-evaluation (the memoize pattern), and easy simple parallel computing.
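The memoize pattern mentioned here can be illustrated with the standard library's functools.lru_cache (a stdlib stand-in for illustration; joblib's Memory additionally caches results to disk):

```python
from functools import lru_cache

calls = []

@lru_cache(maxsize=None)
def slow_square(x):
    calls.append(x)  # record every real execution
    return x * x

first = slow_square(4)   # computes and caches the result
second = slow_square(4)  # served from the cache; the body does not run again
```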
I think this is caused in part by the way Parallel spawns its child workers, and by how Jupyter Notebook handles IO for those workers. When started without specifying a value for backend, Parallel defaults to loky, which uses a pooling strategy built on a fork-exec model to create the subprocesses.
If you start Notebook from a terminal using
$ jupyter-notebook
the regular stderr and stdout streams appear to remain attached to that terminal, while the notebook session opens in a new browser window. Running the posted code snippet in the notebook does produce the expected output, but it goes to stdout and ends up in the terminal (as hinted in the Note in the question). This further supports the suspicion that this behavior is caused by the interaction between loky and Notebook, and by the way Notebook handles the standard IO streams of child processes.
This led me to this discussion on GitHub (active within the past 2 weeks as of this posting), where the Notebook authors appear to be aware of the issue, but it would seem that there is no obvious and quick fix for it at the moment.
If you don't mind switching the backend that Parallel uses to spawn children, you can do so like this:
from joblib import Parallel, delayed
Parallel(n_jobs=8, backend='multiprocessing')(delayed(print)(i) for i in range(10))
With the multiprocessing backend, things work as expected. The threading backend looks to work fine too. This may not be the solution you were hoping for, but hopefully it is sufficient while the Notebook authors work on finding a proper solution.
I'll cross-post this to GitHub in case anyone there cares to add to this answer (I don't want to misstate anyone's intent or put words in people's mouths!).
Test Environment:
MacOS - Mojave (10.14)
Python - 3.7.3
pip3 - 19.3.1
Tested in 2 configurations. Confirmed to produce the expected output when using both multiprocessing and threading for the backend parameter. Packages installed using pip3.
Setup 1:
ipykernel 5.1.1
ipython 7.5.0
jupyter 1.0.0
jupyter-client 5.2.4
jupyter-console 6.0.0
jupyter-core 4.4.0
notebook 5.7.8
Setup 2:
ipykernel 5.1.4
ipython 7.12.0
jupyter 1.0.0
jupyter-client 5.3.4
jupyter-console 6.1.0
jupyter-core 4.6.2
notebook 6.0.3
I was also successful using the same versions as 'Setup 2', but with the notebook package downgraded to 6.0.2.
This approach works inconsistently on Windows: different combinations of software versions yield different results, and doing the most intuitive thing (upgrading everything to the latest version) does not guarantee it will work.
In the GitHub discussion linked in Z4-tier's answer, scottgigante's method works on Windows, but with the opposite of the reported results: in Jupyter Notebook, the "multiprocessing" backend hangs forever, while the default loky backend works well (Python 3.8.5 and notebook 6.1.1):
from joblib import Parallel, delayed
import sys

def g(x):
    stream = getattr(sys, "stdout")
    print("{}".format(x), file=stream)
    stream.flush()
    return x

Parallel(n_jobs=2)(delayed(g)(x**2) for x in range(5))
[0, 1, 4, 9, 16]
A simpler method is to use an identity function in delayed:
import numpy as np
Parallel(n_jobs=2)(delayed(lambda y: y)([np.log(x), np.sin(x)]) for x in range(5))
[[-inf, 0.0],
[0.0, 0.8414709848078965],
[0.6931471805599453, 0.9092974268256817],
[1.0986122886681098, 0.1411200080598672],
[1.3862943611198906, -0.7568024953079282]]
Or use it like this:
Parallel(n_jobs=2)(delayed(lambda y: [np.log(y), np.sin(y)])(x) for x in range(5))
[[-inf, 0.0],
[0.0, 0.8414709848078965],
[0.6931471805599453, 0.9092974268256817],
[1.0986122886681098, 0.1411200080598672],
[1.3862943611198906, -0.7568024953079282]]
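One caveat worth noting (my addition, not from the linked discussion): the standard library's pickle module cannot serialize lambdas, so these lambda tricks rely on the loky backend shipping tasks with cloudpickle; with the multiprocessing backend you would need a named, module-level function instead. A quick sketch of the limitation:

```python
import pickle

def identity(y):
    # a named, module-level function normally pickles by reference
    return y

def can_pickle(obj):
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False

named_ok = can_pickle(identity)      # usually True at module level
lambda_ok = can_pickle(lambda y: y)  # False: stdlib pickle rejects lambdas
```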