Why can't I use operator.itemgetter in a multiprocessing.Pool?

The following program:

import multiprocessing,operator
f = operator.itemgetter(0)
# def f(*a): return operator.itemgetter(0)(*a)
if __name__ == '__main__':
    multiprocessing.Pool(1).map(f, ["ab"])

fails with the following error:

Process PoolWorker-1:
Traceback (most recent call last):
  File "/usr/lib/python3.2/multiprocessing/process.py", line 267, in _bootstrap
    self.run()
  File "/usr/lib/python3.2/multiprocessing/process.py", line 116, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.2/multiprocessing/pool.py", line 102, in worker
    task = get()
  File "/usr/lib/python3.2/multiprocessing/queues.py", line 382, in get
    return recv()
TypeError: itemgetter expected 1 arguments, got 0

Why do I get this error (on CPython 2.7 and 3.2 on Linux x64), and why does it vanish if I uncomment the third line?

phihag asked Jun 27 '12 21:06


1 Answer

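The problem here is that the multiprocessing module passes objects into the other processes by copy (it pickles them), and on the Python versions in question (2.7 and 3.2) itemgetter objects cannot be copied by any of the usual means. (Pickling support for itemgetter was only added in Python 3.5.)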

In [1]: import copy, operator, pickle

In [2]: a = operator.itemgetter(0)

In [3]: copy.copy(a)
TypeError: itemgetter expected 1 arguments, got 0

In [4]: copy.deepcopy(a)
TypeError: itemgetter expected 1 arguments, got 0

In [5]: pickle.dumps(a)
TypeError: can't pickle itemgetter objects

# etc.

The failure doesn't occur when f is called in the other process; it occurs earlier, when multiprocessing tries to copy f into that process in the first place. (The full stack traces, which I omitted above, give a lot more information on why the copy fails.)

Of course usually this doesn't matter, because it's nearly as easy and efficient to construct a new itemgetter on the fly as to copy one. And this is what your alternative "f" function is doing. (Copying a function that creates an itemgetter on the fly doesn't require copying an itemgetter, of course.)
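As a rough illustration of why the wrapper function survives the trip: pickle stores a module-level function by reference (its module and name), so nothing inside the function body has to be copied. The name first_item below is made up for this sketch, and the round trip is done with pickle directly rather than through a Pool:

```python
import operator
import pickle


def first_item(seq):
    # The itemgetter is built inside the call, so it never has to be pickled.
    return operator.itemgetter(0)(seq)


if __name__ == "__main__":
    # Only a reference (module name plus function name) goes into the payload.
    payload = pickle.dumps(first_item)
    restored = pickle.loads(payload)
    print(restored is first_item)
    print(restored("ab"))
```

Because only the name is stored, the receiving process must be able to import the function under that same name, which is why it has to live at module level.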

You could write a trivial module-level named function that does the same thing without using itemgetter. (A lambda won't work with Pool.map: pickle serializes functions by name, and a lambda has no name it can be looked up by.) Or you could write an itemgetter replacement that is picklable, which wouldn't be hard. But on these Python versions you can't use itemgetter objects directly the way you want to.
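A minimal sketch of such a replacement, assuming a single key is enough (the real itemgetter also accepts several). The class name PicklableItemGetter is hypothetical and not part of the standard library:

```python
import pickle


class PicklableItemGetter:
    """Hypothetical single-key stand-in for operator.itemgetter."""

    def __init__(self, item):
        # Plain instance attribute, so the default pickling machinery handles it.
        self.item = item

    def __call__(self, obj):
        return obj[self.item]


if __name__ == "__main__":
    getter = PicklableItemGetter(0)
    # Round-trip through pickle, which is what multiprocessing does to ship it.
    clone = pickle.loads(pickle.dumps(getter))
    print(clone("ab"))
```

An instance of a class defined at module level pickles as a reference to the class plus its attribute dictionary, so it should be usable as the function argument of Pool.map in place of the itemgetter.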

abarnert answered Jan 02 '23 11:01