
Python multiprocessing returning AttributeError when following documentation code [duplicate]

I decided to try the multiprocessing module to help speed up my program. To figure it out, I tried some of the code examples from the official Python documentation on multiprocessing.

First attempt: Introduction

>>> from multiprocessing import Pool
>>>
>>> def f(x):
...     return x*x
...
>>> if __name__ == '__main__':
...     with Pool(5) as p:
...         print(p.map(f, [1, 2, 3]))
...
Process SpawnPoolWorker-3:
Process SpawnPoolWorker-2:
Traceback (most recent call last):
  File "C:\Program Files\Python36\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "C:\Program Files\Python36\lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Program Files\Python36\lib\multiprocessing\pool.py", line 108, in worker
    task = get()
  File "C:\Program Files\Python36\lib\multiprocessing\queues.py", line 337, in get
    return _ForkingPickler.loads(res)
AttributeError: Can't get attribute 'f' on <module '__main__' (built-in)>
Traceback (most recent call last):
  File "C:\Program Files\Python36\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "C:\Program Files\Python36\lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Program Files\Python36\lib\multiprocessing\pool.py", line 108, in worker
    task = get()
  File "C:\Program Files\Python36\lib\multiprocessing\queues.py", line 337, in get
    return _ForkingPickler.loads(res)
AttributeError: Can't get attribute 'f' on <module '__main__' (built-in)>
Process SpawnPoolWorker-4:
Traceback (most recent call last):
  File "C:\Program Files\Python36\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "C:\Program Files\Python36\lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Program Files\Python36\lib\multiprocessing\pool.py", line 108, in worker
    task = get()
  File "C:\Program Files\Python36\lib\multiprocessing\queues.py", line 337, in get
    return _ForkingPickler.loads(res)
AttributeError: Can't get attribute 'f' on <module '__main__' (built-in)>

Here I assume that the Pool class is broken; maybe there is a typo somewhere in the latest version. So I try some of the more specific code.

Second attempt: Process class code block 2

>>> from multiprocessing import Process
>>> import os
>>>
>>> def info(title):
...     print(title)
...     print('module name:', __name__)
...     print('parent process:', os.getppid())
...     print('process id:', os.getpid())
...
>>> def f(name):
...     info('function f')
...     print('hello', name)
...
>>> if __name__ == '__main__':
...     info('main line')
...     p = Process(target=f, args=('bob',))
...     p.start()
...     p.join()
...
main line
module name: __main__
parent process: 43824
process id: 54888
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Program Files\Python36\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Program Files\Python36\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
AttributeError: Can't get attribute 'f' on <module '__main__' (built-in)>

At this point I know the underlying error is in multiprocessing's Process class. However, I think the extended code might have broken something, so I try the simpler example.

Third attempt: Process class code block 1

>>> from multiprocessing import Process
>>>
>>> def f(name):
...     print('hello', name)
...
>>> if __name__ == '__main__':
...     p = Process(target=f, args=('bob',))
...     p.start()
...     p.join()
...
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Program Files\Python36\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Program Files\Python36\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
AttributeError: Can't get attribute 'f' on <module '__main__' (built-in)>

At this point I am desperate. I think that maybe the argument is messing with the Process class.

Final attempt: self-generated code

>>> from multiprocessing import Process
>>>
>>> def f():
...     print('hello')
...
>>> if __name__ == '__main__':
...     p = Process(target=f)
...     p.start()
...     p.join()
...
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Program Files\Python36\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Program Files\Python36\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
AttributeError: Can't get attribute 'f' on <module '__main__' (built-in)>

Now I am totally confused because I do not know why the error is occurring. Could someone help me figure out why I am getting this error every time?

Pythonic Guy 21421 asked Feb 03 '18 03:02


2 Answers

You're in interactive mode. That basically doesn't work with multiprocessing, because the workers have to import __main__ and get something that mostly resembles the main process's __main__. This is one of the many ways in which the multiprocessing API is horribly confusing.

Put your code in a script and run the script.
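
As a minimal sketch of that advice, here is the question's first example saved as a script. The file name pool_demo.py and the main() wrapper are my own choices for illustration, not something the documentation prescribes.

```python
# pool_demo.py (hypothetical file name); run with: python pool_demo.py
from multiprocessing import Pool


def f(x):
    # Defined at module level, so a spawned worker can find it by name
    # when it re-imports this file.
    return x * x


def main():
    # Same call as in the question: map f over the inputs using 5 workers.
    with Pool(5) as p:
        return p.map(f, [1, 2, 3])


if __name__ == '__main__':
    # The guard keeps spawned workers from re-running this block when they
    # import the script as their own main module.
    print(main())
```

Because f now lives in a real file, the worker processes can import that file and look f up, which is the step that fails in the interactive session.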

user2357112 supports Monica answered Nov 14 '22 23:11


When multiprocessing is invoked on Windows, it uses the spawn strategy for creating new processes.

The parent process starts a fresh Python interpreter process.
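
If you are unsure which strategy your platform uses, the following sketch lists the supported start methods. It relies on multiprocessing.get_all_start_methods(), which is documented to return the platform default first; the variable names are mine.

```python
import multiprocessing as mp
import sys

# Supported start methods on this platform; the first entry is the default.
available = mp.get_all_start_methods()
default_method = available[0]

print("platform:", sys.platform)
print("available start methods:", available)
print("default start method:", default_method)
```

On Windows the list should contain only spawn, which is why the behaviour described here always applies there.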

The rough strategy taken here for function objects that are "pickled" across processes is:

  1. Record the module of the function before creating a new process (in this case f.__module__ => __main__)
  2. Encode that to some representation
  3. In the newly spawned process, initialize the main module (for interactive execution this is an empty module)
  4. "unpickle" the arguments; for functions this means:
    1. import their module
    2. access their function name from that module (where you're getting your AttributeError)

In your case this looks roughly like this:

  1. Record ('__main__', 'f')
  2. Encode that
  3. Spawn a new process, initialize an empty __main__ module
  4. Unpickle (recover __main__ and f)
    1. import __main__ as mod
    2. obj = getattr(mod, 'f') (boom!)
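
The steps above can be reproduced in a single process with plain pickle. This is only a sketch of the lookup mechanism, not what multiprocessing literally runs: the module name demo_main is a hypothetical stand-in for __main__, and swapping the entry in sys.modules imitates the empty __main__ that a spawned child starts with.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the parent's __main__: a module that defines f.
demo = types.ModuleType("demo_main")
exec("def f(x):\n    return x * x\n", demo.__dict__)
sys.modules["demo_main"] = demo

# Steps 1-2: pickling a function records its module name and its own name,
# not its code.
payload = pickle.dumps(demo.f)
print(payload)

# Step 3: imitate the child process, where the module exists but is empty.
sys.modules["demo_main"] = types.ModuleType("demo_main")

# Step 4: unpickling imports the module and looks the name up on it.
error_message = None
try:
    pickle.loads(payload)
except AttributeError as exc:
    error_message = str(exc)

print(error_message)

# Clean up the demonstration module.
del sys.modules["demo_main"]
```

The printed payload contains the strings demo_main and f but no function body, so the receiving side can only succeed if it is able to import a module that really defines f.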

For more details about the specific pickling / unpickling, check out the ForkingPickler

Here's an excerpt:

#
# Try making some callable types picklable
#

def _reduce_method(m):
    if m.__self__ is None:
        return getattr, (m.__class__, m.__func__.__name__)
    else:
        return getattr, (m.__self__, m.__func__.__name__)
class _C:
    def f(self):
        pass
register(type(_C().f), _reduce_method)


def _reduce_method_descriptor(m):
    return getattr, (m.__objclass__, m.__name__)
register(type(list.append), _reduce_method_descriptor)
register(type(int.__add__), _reduce_method_descriptor)

The fix is to put your code into an actual module (a .py file), so that when the spawned process re-initializes __main__ on the other side, it can import that module and find f.

Anthony Sottile answered Nov 14 '22 23:11