
Why does this implementation of multiprocessing.pool not work?

Here is the code I am using:

import os
import numpy as np
import sympy as sy
from multiprocessing import Pool

def initFunction(arg1, arg2):
    def funct(value):
        return arg1 * arg2 * value
    return funct

os.system("taskset -p 0xff %d" % os.getpid()) 
pool = Pool(processes=4)
t = np.linspace(0, 1, 10000)  # num must be an integer

a,b,c,d,e,f,g,h = sy.symbols('a,b,c,d,e,f,g,h',commutative=False)

arg1 = sy.Matrix([[a,b],[c,d]])
arg2 = sy.Matrix([[e,f],[g,h]])
myFunct = initFunction(arg1, arg2)

m3 = map(myFunct,t) # this works
m4 = pool.map(myFunct,t) # this does NOT work

The error I'm getting is:

Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/usr/lib/python2.7/dist-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 540, in runfile
      execfile(filename, namespace)
   File "/home/justin/Research/mapTest.py", line 46, in <module>
      m4 = pool.map(myFunct,t) 
   File "/usr/lib/python2.7/multiprocessing/pool.py", line 251, in map
      return self.map_async(func, iterable, chunksize).get()
   File "/usr/lib/python2.7/multiprocessing/pool.py", line 558, in get
      raise self._value
cPickle.PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed

So what does this error mean and how can I multiprocess this map function?

asked Jul 14 '14 00:07 by bynx


1 Answer

Objects that you pass between processes when using multiprocessing must be picklable. Functions are pickled by reference (module name plus function name), so they have to be defined at the top level of a module, where the child process can look them up when unpickling. Nested functions, like funct, can't be looked up that way, so you get that error. You can achieve what you're trying to do by using functools.partial instead:

from multiprocessing import Pool
from functools import partial

def funct(arg1, arg2, value):
    return arg1 * arg2 * value


if __name__ == "__main__":
    t = [1,2,3,4]
    arg1 = 4 
    arg2 = 5 

    pool = Pool(processes=4)
    func = partial(funct, arg1, arg2)
    m4 = pool.map(func,t)
    print(m4)

Output:

[20, 40, 60, 80]
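
To see the difference without starting a pool, here is a minimal sketch that pickles both kinds of callable directly, which is roughly what pool.map has to do before it sends work to a child process. It is an illustration rather than part of the code above: it assumes Python 3, and init_function and is_picklable are hypothetical names introduced for this example.

```python
import pickle
from functools import partial


def init_function(arg1, arg2):
    # Closure factory like the one in the question: the inner funct only
    # exists inside this call, so pickle cannot look it up by name.
    def funct(value):
        return arg1 * arg2 * value
    return funct


def funct(arg1, arg2, value):
    # Top-level function: pickle stores a reference (module + name).
    return arg1 * arg2 * value


def is_picklable(obj):
    # Hypothetical helper: True if pickle.dumps succeeds on obj.
    # The exact exception type for local functions varies by Python version.
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


nested_ok = is_picklable(init_function(4, 5))      # nested closure
partial_ok = is_picklable(partial(funct, 4, 5))    # partial of top-level function
print(nested_ok, partial_ok)
```

Run as a script, the nested closure should report as not picklable while the partial reports as picklable, which matches the behaviour of map versus pool.map in the question.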
answered Oct 05 '22 23:10 by dano