
Python multiprocessing Pool with map_async

I am trying to use the multiprocessing package in Python with a Pool.

I have the function f which is called by the map_async function:

from multiprocessing import Pool

def f(host, x):
    print host
    print x

hosts = ['1.1.1.1', '2.2.2.2']
pool = Pool(processes=5)
pool.map_async(f,hosts,"test")
pool.close()
pool.join()

This code raises the following error:

Traceback (most recent call last):
  File "pool-test.py", line 9, in <module>
    pool.map_async(f,hosts,"test")
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 290, in map_async
    result = MapResult(self._cache, chunksize, len(iterable), callback)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 557, in __init__
    self._number_left = length//chunksize + bool(length % chunksize)
TypeError: unsupported operand type(s) for //: 'int' and 'str'

I don't know how to pass more than one argument to the function f. Is there any way?

dseira asked May 14 '13 11:05



1 Answer

"test" is interpreted as map_async's chunksize argument, because chunksize is its third positional parameter (see the docs).
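
One way to check where a third positional argument ends up is to inspect the method's signature. This is a small sketch that assumes Python 3 (the question uses Python 2.7, where the order func, iterable, chunksize is the same):

```python
import inspect
from multiprocessing.pool import Pool

# List the parameter names of Pool.map_async in declaration order.
params = list(inspect.signature(Pool.map_async).parameters)
print(params)

# Skipping `self`, the third positional slot is `chunksize`, so
# pool.map_async(f, hosts, "test") binds "test" to chunksize, not to f.
third_positional = params[3]
print(third_positional)
```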

Your code should probably look like this (copy-pasted from my IPython session):

from multiprocessing import Pool

def f(arg):
    host, x = arg
    print host
    print x

hosts = ['1.1.1.1', '2.2.2.2']
args = ((host, "test") for host in hosts)
pool = Pool(processes=5)
pool.map_async(f, args)
pool.close()
pool.join()
## -- End pasted text --

1.1.1.1
test
2.2.2.2
test

Note: In Python 3.3 and later you can use starmap, which unpacks the arguments from the tuples, so you can avoid doing host, x = arg explicitly.
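
A minimal sketch of the starmap variant, assuming Python 3.3 or later. The names describe, run and pool_cls are illustrative and not from the original post, and the worker returns a string instead of printing so the results can be collected in the parent process:

```python
from multiprocessing import Pool


def describe(host, x):
    # starmap unpacks each tuple, so both values arrive as separate arguments.
    return "{} {}".format(host, x)


def run(hosts, x, pool_cls=Pool):
    # Pair every host with the same extra argument.
    args = [(host, x) for host in hosts]
    # The context manager terminates the pool once the results are collected.
    with pool_cls(processes=5) as pool:
        # starmap_async is the non-blocking counterpart of starmap;
        # .get() waits for the results and returns them in input order.
        return pool.starmap_async(describe, args).get()


if __name__ == "__main__":
    print(run(['1.1.1.1', '2.2.2.2'], "test"))
```

The pool_cls parameter is only there so the sketch can also be tried with multiprocessing.pool.ThreadPool, which offers the same interface; in normal use it can stay at its default.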

F.X. answered Oct 27 '22 14:10