Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does the callback function work in multiprocessing map_async?

It cost me a whole night to debug my code, and I finally found this tricky problem. Please take a look at the code below.

from multiprocessing import Pool  def myfunc(x):     return [i for i in range(x)]  pool=Pool()  A=[] r = pool.map_async(myfunc, (1,2), callback=A.extend) r.wait() 

I thought I would get A=[0,0,1], but the output is A=[[0],[0,1]]. This does not make sense to me because if I have A=[], A.extend([0]) and A.extend([0,1]) will give me A=[0,0,1]. Probably the callback works in a different way. So my question is how to get A=[0,0,1] instead of [[0],[0,1]]?

like image 229
user2727768 Avatar asked Oct 31 '13 05:10

user2727768


People also ask

How does multiprocessing pool work?

The Pool class in multiprocessing can handle an enormous number of processes. It allows you to run multiple jobs per process (due to its ability to queue the jobs). The memory is allocated only to the executing processes, unlike the Process class, which allocates memory to all the processes.

How does Python multiprocessing work?

The multiprocessing Python module contains two classes capable of handling tasks. The Process class sends each task to a different processor, and the Pool class sends sets of tasks to different processors.

Which is the method used to change the default way to create child processes in multiprocessing?

Python provides the ability to create and manage new processes via the multiprocessing. Process class. In multiprocessing programming, we may need to change the technique used to start child processes. This is called the start method.

What is multiprocessing dummy?

multiprocessing. dummy replicates the API of multiprocessing but is no more than a wrapper around the threading module. That means you're restricted by the Global Interpreter Lock (GIL), and only one thread can actually execute CPU-bound operations at a time. That's going to keep you from fully utilizing your CPUs.


1 Answers

Callback is called once with the result ([[0], [0, 1]]) if you use map_async.

>>> from multiprocessing import Pool >>> def myfunc(x): ...     return [i for i in range(x)] ...  >>> A = [] >>> def mycallback(x): ...     print('mycallback is called with {}'.format(x)) ...     A.extend(x) ...  >>> pool=Pool() >>> r = pool.map_async(myfunc, (1,2), callback=mycallback) >>> r.wait() mycallback is called with [[0], [0, 1]] >>> print(A) [[0], [0, 1]] 

Use apply_async if you want callback to be called for each time.

pool=Pool() results = [] for x in (1,2):     r = pool.apply_async(myfunc, (x,), callback=mycallback)     results.append(r) for r in results:     r.wait() 
like image 198
falsetru Avatar answered Sep 30 '22 18:09

falsetru