I'm using joblib to parallelize my Python 3.5 code.
If I do:
    from modules import f
    from joblib import Parallel, delayed

    if __name__ == '__main__':
        Parallel(n_jobs=n_jobs, backend="multiprocessing")(delayed(f)(i) for i in range(10))

the code doesn't work. Instead:
    from joblib import Parallel, delayed

    def f(i):
        # my func ...
        pass

    if __name__ == '__main__':
        Parallel(n_jobs=n_jobs, backend="multiprocessing")(delayed(f)(i) for i in range(10))

This works!
Can someone explain why I have to put all my functions in the same script?
That is really impractical, because modules contains plenty of functions I wrote that I don't want to copy and paste into the main script.
Joblib is a set of tools to provide lightweight pipelining in Python. In particular: transparent disk-caching of functions and lazy re-evaluation (memoize pattern), and easy, simple parallel computing.
Dependencies. Joblib has no mandatory dependencies besides Python (supported versions are 3.6+). Joblib has an optional dependency on Numpy (at least version 1.6.1) for array manipulation.
The delayed function is a simple trick to create a tuple (function, args, kwargs) with function-call syntax. Under Windows, using multiprocessing.Pool requires protecting the main loop of code to avoid recursive spawning of subprocesses when using joblib.
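To make the (function, args, kwargs) idea concrete, here is a minimal sketch of how such a wrapper could look. It is a simplified stand-in written for illustration, not joblib's actual implementation; the names `delayed_sketch` and `square` are hypothetical.

```python
import functools


def delayed_sketch(function):
    """Simplified stand-in for joblib.delayed (an assumption, not the real code).

    Calling the returned wrapper does not run `function`; it only records
    the call as a (function, args, kwargs) tuple for later execution.
    """
    @functools.wraps(function)
    def capture(*args, **kwargs):
        return function, args, kwargs
    return capture


def square(x, offset=0):
    # Hypothetical worker function used only for this illustration.
    return x * x + offset


# Looks like a normal call, but nothing is computed yet.
task = delayed_sketch(square)(3, offset=1)

# A worker would later unpack the tuple and perform the real call.
func, args, kwargs = task
result = func(*args, **kwargs)
```

With the multiprocessing backend, a tuple like this has to be sent to the worker process, so the function in it must be something the worker can locate and import.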
Joblib provides a way to avoid recomputing the same function repeatedly, saving a lot of time and computational cost, for example when a function simply computes the square of each number over a given range.
I faced a similar issue. When I call an imported function, it just freezes, and when I call a local function it works fine. I solved it by using multithreading instead of multiprocessing, like this:

    Parallel(n_jobs=n_jobs, backend='threading')(delayed(f)(i) for i in range(10))

I found a workaround that allows you to keep the helper functions in a separate module. For each imported function that you want to parallelize, define a proxy function in your main module, e.g. as

    def f_proxy(*args, **kwargs):
        return f(*args, **kwargs)

and simply use delayed(f_proxy). It is still somewhat unsatisfactory, but cleaner than moving all helper functions into the main module.