If you open a Jupyter Notebook and run this:
import multiprocessing

def f(x):
    a = 3 * x
    pool = multiprocessing.Pool(processes=1)
    global g
    def g(j):
        return a * j
    return pool.map(g, range(5))

f(1)
You will get the following error:
Process ForkPoolWorker-1:
Traceback (most recent call last):
  File "/Users/me/anaconda3/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/Users/me/anaconda3/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/me/anaconda3/lib/python3.5/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/Users/me/anaconda3/lib/python3.5/multiprocessing/queues.py", line 345, in get
    return ForkingPickler.loads(res)
AttributeError: Can't get attribute 'g' on <module '__main__'>
and I'm trying to understand if this is a bug or a feature.
I'm trying to get this working because in my real case f is basically a for loop that is easy to parallelize (only one parameter changes per iteration), but each iteration takes a long time. Am I approaching the problem correctly, or is there an easier way? (Note: f itself will be called several times throughout the notebook with different parameters.)
It works just fine if you define g outside of f.
import multiprocessing

def g(j):
    return 4 * j

def f():
    pool = multiprocessing.Pool(processes=1)
    return pool.map(g, range(5))

f()
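One way to see why moving g to module level matters: pool.map has to pickle the function to send it to the worker, and pickle records a function as a module name plus a function name rather than copying its code. The worker then has to look that name up, which is the step that fails in the traceback above. The sketch below is only an illustration of that by-reference behaviour. It uses textwrap.dedent as a stand-in for any importable module-level function (an assumption, it is not part of the question) and a lambda as an example of a function with no importable name.

```python
import pickle
import textwrap

# A module-level function is pickled by reference: only its module and
# name go into the payload, not its bytecode.
payload = pickle.dumps(textwrap.dedent)
names_in_payload = b"textwrap" in payload and b"dedent" in payload

# Unpickling looks the name up again and returns the very same object.
restored = pickle.loads(payload)
same_object = restored is textwrap.dedent

# A lambda has no name that can be looked up, so pickling it fails.
try:
    pickle.dumps(lambda j: 4 * j)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

print(names_in_payload, same_object, lambda_pickled)
```

Anything handed to pool.map goes through the same mechanism, so the callable needs a name that the worker process can resolve.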
Edit: For the example in your question, the callable object would look something like this:
class Calculator():
    def __init__(self, j):
        self.j = j

    def __call__(self, x):
        return self.j*x
and your function f becomes something like this:
def f(j):
    calculator = Calculator(j)
    pool = multiprocessing.Pool(processes=1)
    return pool.map(calculator, range(5))
In this case it works just fine. Hope it helps.
If you want to apply g to more arguments than just the element that pool.map passes from the iterable, you can use functools.partial like this:
import multiprocessing
import functools

def g(a, j):
    return a * j

def f(x):
    a = 3 * x
    pool = multiprocessing.Pool(processes=1)
    g_with_a = functools.partial(g, a)
    return pool.map(g_with_a, range(5))

f(1)
functools.partial takes a function and an arbitrary number of arguments (positional and keyword) and returns a new callable that behaves like the function you passed to it, but only needs the arguments you didn't pass to partial.
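A small sketch of that behaviour, without any multiprocessing involved. The helper scale and its offset parameter are hypothetical names made up for this illustration. They are not from the question.

```python
import functools

def scale(factor, value, offset=0):
    """Hypothetical helper used only for this illustration."""
    return factor * value + offset

# Bind the first positional argument; the result still expects `value`.
triple = functools.partial(scale, 3)

# Keyword arguments can be bound the same way.
triple_plus_one = functools.partial(scale, 3, offset=1)

results = [triple(j) for j in range(5)]
shifted = [triple_plus_one(j) for j in range(5)]

# The partial object remembers the function and arguments it was given.
bound = (triple_plus_one.func is scale,
         triple_plus_one.args,
         triple_plus_one.keywords)
```

Here triple(j) is equivalent to scale(3, j), which is the same trick g_with_a uses to carry a into each call made by pool.map.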
The object returned by partial can be pickled without problems, i.e. passed to pool.map, as long as you're using Python 3.
This is essentially the same as Darth Kotik described in his answer, but you don't have to implement the Calculator class yourself, as partial already does what you want.
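To check the pickling claim without starting a pool, you can round-trip a partial through pickle yourself. This is a minimal sketch, assuming operator.mul as a stand-in for g (it computes the same a * j and is importable from anywhere). It also shows the caveat that the partial is only picklable if the function it wraps is.

```python
import functools
import operator
import pickle

a = 3

# Stand-in for functools.partial(g, a) from the answer above.
g_with_a = functools.partial(operator.mul, a)

# Round-trip through pickle, as pool.map would do when sending the task.
restored = pickle.loads(pickle.dumps(g_with_a))
results = list(map(restored, range(5)))

# Wrapping a lambda does not help: the partial pickles its wrapped
# function by reference, and a lambda has no importable name.
try:
    pickle.dumps(functools.partial(lambda x, j: x * j, a))
    lambda_partial_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_partial_pickled = False
```

So partial saves you from writing the Calculator class, but the wrapped function still has to live at module level.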