
Error using the Python multiprocessing module with a generator function

Could someone explain what is wrong with the code below?

from multiprocessing import Pool
def sq(x):
    yield x**2
p = Pool(2)

n = p.map(sq, range(10))

I am getting the following error:

MaybeEncodingError Traceback (most recent call last)
in ()
      5 p = Pool(2)
      6
----> 7 n = p.map(sq, range(10))

/home/devil/anaconda3/lib/python3.4/multiprocessing/pool.py in map(self, func, iterable, chunksize)
    258 in a list that is returned.
    259 '''
--> 260 return self._map_async(func, iterable, mapstar, chunksize).get()
    261
    262 def starmap(self, func, iterable, chunksize=None):

/home/devil/anaconda3/lib/python3.4/multiprocessing/pool.py in get(self, timeout)
    606 return self._value
    607 else:
--> 608 raise self._value
    609
    610 def _set(self, i, obj):

MaybeEncodingError: Error sending result: '[, ]'. Reason: 'TypeError("can't pickle generator objects",)'

Many thanks in advance.

Manu asked Feb 04 '17 13:02


1 Answer

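You have to use a regular function here, not a generator function: replace yield with return in sq. Pool can't work with generators, because each result must be pickled to be sent back from the worker process, and generator objects can't be pickled (this is the TypeError reported in the traceback).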
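The pickling limitation can be checked without starting a pool at all. The following is a minimal sketch, not part of the original answer: sq_gen and can_pickle are hypothetical names introduced only for illustration. It also shows one possible workaround, assuming the worker really needs to produce several values: materialize the generator into a list before returning it.

```python
import pickle


def sq_gen(x):
    # Generator function, as in the question: calling it returns a generator object.
    yield x ** 2


def sq(x):
    # Regular function: returns a plain int, which pickles without trouble.
    return x ** 2


def can_pickle(obj):
    """Hypothetical helper: True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


if __name__ == "__main__":
    print(can_pickle(sq_gen(3)))        # generator object
    print(can_pickle(sq(3)))            # plain int
    print(can_pickle(list(sq_gen(3))))  # generator materialized into a list
```

Only the first call should report a failure; the int and the list are ordinary picklable values, which is why switching sq to return makes Pool.map work.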

Moreover, when trying to create a working version on Windows, I got a strange, repeated error message:

Attempt to start a new process before the current process
has finished its bootstrapping phase.

This probably means that you are on Windows and you have
forgotten to use the proper idiom in the main module:

if __name__ == '__main__':

To quote literally the comment I got, since it's self-explanatory:

"The error on Windows is because each process spawns a new Python process which interprets the Python file etc., so everything outside the "if main" block is executed again."

So, to be portable, you have to put the pool code under if __name__ == "__main__": when running this module:

from multiprocessing import Pool

def sq(x):
    return x**2

if __name__=="__main__":
    p = Pool(2)
    n = p.map(sq, range(10))
    print(n)

Result:

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Edit: if you don't want to store the values beforehand, you can use imap:

n = p.imap(sq, range(10))

n is now a lazy iterator rather than a list. To consume the values as they become available, I force iteration by passing it to list, and I get the same result as above:

print(list(n))

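Note that imap can be much slower than map on long iterables: it uses a chunksize of 1 by default, and the documentation indicates that a larger chunksize can make the job complete much faster.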
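Pool.imap accepts a chunksize argument, so the lazy variant can be sketched as follows. This is an illustration under stated assumptions rather than a tuned setup: lazy_squares is a hypothetical helper name, and chunksize=4 is an arbitrary value chosen for the example.

```python
from multiprocessing import Pool


def sq(x):
    return x ** 2


def lazy_squares(pool, values, chunksize=4):
    # Hypothetical helper: chunksize=4 is an arbitrary illustrative value.
    # imap returns an iterator that yields results in input order.
    return pool.imap(sq, values, chunksize=chunksize)


if __name__ == "__main__":
    # The with block cleans up the pool on exit, so the results are
    # consumed inside it.
    with Pool(2) as p:
        for value in lazy_squares(p, range(10)):
            print(value)
```

Iterating inside the with block matters here: leaving the block terminates the pool, so any results not yet consumed would be lost.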

Jean-François Fabre answered Sep 30 '22 15:09