
Running a ProcessPoolExecutor in IPython

I was running a simple multiprocessing example in my IPython interpreter (IPython 7.9.0, Python 3.8.0) on my MacBook and ran into a strange error. Here's what I typed:

In [1]: from concurrent.futures import ProcessPoolExecutor

In [2]: executor=ProcessPoolExecutor(max_workers=1)

In [3]: def func():
            print('Hello')

In [4]: future=executor.submit(func)

However, I received the following error:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/process.py", line 313, in _bootstrap
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)                                   
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/concurrent/futures/process.py", line 233, in _process_worker
    call_item = call_queue.get(block=True)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/queues.py", line 116, in get
    return _ForkingPickler.loads(res)
AttributeError: Can't get attribute 'func' on <module '__main__' (built-in)>

Furthermore, trying to submit the job again gave me a different error:

In [5]: future=executor.submit(func)
---------------------------------------------------------------------------
BrokenProcessPool                         Traceback (most recent call last)
<ipython-input-5-42bad1a6fe80> in <module>
----> 1 future=executor.submit(func)

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/concurrent/futures/process.py in submit(*args, **kwargs)
    627         with self._shutdown_lock:
    628             if self._broken:
--> 629                 raise BrokenProcessPool(self._broken)
    630             if self._shutdown_thread:
    631                 raise RuntimeError('cannot schedule new futures after shutdown')

BrokenProcessPool: A child process terminated abruptly, the process pool is not usable anymore

As a sanity check, I put almost the same code into a Python file and ran it from the command line (python3 test.py). It worked fine.

Why does IPython have an issue with my test?

EDIT:

Here's the Python file that worked fine.

from concurrent.futures import ProcessPoolExecutor as Executor

def func():
    print('Hello')

if __name__ == '__main__':
    with Executor(1) as executor:
        future=executor.submit(func)
        print(future.result())

Asked by Daniel Walker on May 18 '20


1 Answer

Ok, finally found out what is going on. The problem is macOS: since Python 3.8 it uses the "spawn" start method by default to create subprocesses. This is explained at https://docs.python.org/3/library/multiprocessing.html, which also shows how to change the method to "fork" (though it notes that fork is unsafe on macOS).
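To see which start method a given interpreter uses, a small check along these lines may help. This is only a sketch; the helper name describe_start_methods is made up for illustration, and the values it reports depend on the platform and Python version.

```python
import multiprocessing as mp

def describe_start_methods():
    """Return (default_method, available_methods) for this interpreter."""
    available = mp.get_all_start_methods()
    # Caution: calling get_start_method() this way fixes the default
    # context, so a later mp.set_start_method(...) in the same session
    # needs force=True.
    default = mp.get_start_method()
    return default, available

default_method, available_methods = describe_start_methods()
print(f"default: {default_method}, available: {available_methods}")
```

On a Python 3.8+ macOS install this would be expected to report "spawn" as the default, and "fork" on typical Linux installs of that era.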

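With the spawn method, a new Python interpreter is started, and the function you submit is sent to it by name rather than as code. The worker then tries to look up func in __main__, but in an IPython session __main__ is not a script the new interpreter can re-import. There is no program, just interactively entered commands, so the lookup fails with the AttributeError shown in the question.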
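The "by name" part can be illustrated with pickle, which multiprocessing uses to pass work to the child. The sketch below pickles a standard-library function (textwrap.dedent, chosen only as a convenient stand-in for func) and shows that the payload carries the module and function names, which the receiving side resolves by importing the module.

```python
import pickle
import textwrap

# Pickling a function stores a reference (module name + function name),
# not the function's code.
payload = pickle.dumps(textwrap.dedent)
names_in_payload = b"textwrap" in payload and b"dedent" in payload

# Unpickling looks the name up in the named module and returns that object.
restored = pickle.loads(payload)
print(names_in_payload, restored is textwrap.dedent)
```

For a function defined at the IPython prompt the stored module name is __main__, and a freshly spawned worker has no such function in its own __main__.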

If you change the start method to fork, your code runs (but note the caveat above about fork being unsafe on macOS):

In [1]: import multiprocessing as mp                                                                                     

In [2]: mp.set_start_method("fork")                                                                                      

In [3]: def func():
   ...:     print("foo")
   ...:

In [4]: from concurrent.futures import ProcessPoolExecutor                                                               

In [5]: executor=ProcessPoolExecutor(max_workers=1)                                                               

In [6]: future=executor.submit(func)                                                                                     

foo
In [7]:  
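A narrower variant, sketched here as an assumption rather than something tested on macOS, is to pass a fork context to the one executor through its mp_context parameter instead of changing the start method globally. The helper names make_fork_executor and run_demo are hypothetical, and the same safety caveat about fork applies.

```python
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor

def make_fork_executor(max_workers=1):
    """Build an executor whose workers are created with fork."""
    # get_context raises ValueError where fork is unavailable (e.g. Windows).
    ctx = mp.get_context("fork")
    return ProcessPoolExecutor(max_workers=max_workers, mp_context=ctx)

def run_demo():
    """Submit a built-in function and return its result."""
    with make_fork_executor() as executor:
        return executor.submit(pow, 2, 5).result()
```

In IPython, executor = make_fork_executor() followed by executor.submit(func) would take the place of In [2] and In [5] above.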

I am not sure how helpful this answer is given the caveat, but it explains why the code behaves differently when it is run as a program (your other attempt), and why it worked fine on Ubuntu, which uses "fork" by default.
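Another possible way to stay on spawn, sketched here under assumptions and not part of the fix above, is to define the function in a real module file so that a worker can import it by name. The module name pool_tasks_demo and both helper functions are hypothetical; the sketch only verifies that the function pickles by its module name and does not start a pool.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "pool_tasks_demo"  # hypothetical module name

def create_task_module(directory):
    """Write a tiny module that defines func() in the given directory."""
    path = Path(directory) / f"{MODULE_NAME}.py"
    path.write_text("def func():\n    return 'Hello'\n")
    return path

def import_task_module(directory):
    """Make the directory importable and import the module from it."""
    if str(directory) not in sys.path:
        sys.path.insert(0, str(directory))
    importlib.invalidate_caches()
    return importlib.import_module(MODULE_NAME)

workdir = tempfile.mkdtemp()
create_task_module(workdir)
tasks = import_task_module(workdir)

# The function now pickles as pool_tasks_demo.func instead of __main__.func.
restored = pickle.loads(pickle.dumps(tasks.func))
```

With such a module importable from the worker's side as well, executor.submit(tasks.func) in IPython should no longer depend on __main__.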

Answered by Hannu on Oct 15 '22