Why do ProcessPoolExecutor and Pool crash with a super() call?

Tags: python, pickle, python-multiprocessing, process-pool

1. Why does the following Python code using the concurrent.futures module hang forever?

import concurrent.futures


class A:

    def f(self):
        print("called")


class B(A):

    def f(self):
        executor = concurrent.futures.ProcessPoolExecutor(max_workers=2)
        executor.submit(super().f)


if __name__ == "__main__":
    B().f()

The call raises an invisible exception [Errno 24] Too many open files (to see it, replace the line executor.submit(super().f) with print(executor.submit(super().f).exception())).

However, replacing ProcessPoolExecutor with ThreadPoolExecutor prints "called" as expected.

2. Why does the following Python code using the multiprocessing.pool module raise the exception AssertionError: daemonic processes are not allowed to have children?

import multiprocessing.pool


class A:

    def f(self):
        print("called")


class B(A):

    def f(self):
        pool = multiprocessing.pool.Pool(2)
        pool.apply(super().f)


if __name__ == "__main__":
    B().f()

However, replacing Pool with ThreadPool prints "called" as expected.

Environment: CPython 3.7, macOS 10.14.

asked Jun 15 '19 11:06

Maggyero

1 Answer

concurrent.futures.ProcessPoolExecutor and multiprocessing.pool.Pool use multiprocessing.queues.Queue to pass the work function object from the caller to the worker process. Queue uses the pickle module to serialize and deserialize it, but pickle fails to handle a bound method object correctly when the method comes from a parent class and is bound to a child class instance:

import pickle

# Run inside B.f, where super() is available:
f = super().f
print(f)
pf = pickle.loads(pickle.dumps(f))
print(pf)

outputs:

<bound method A.f of <__main__.B object at 0x104b24da0>>
<bound method B.f of <__main__.B object at 0x104cfab38>>

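A.f becomes B.f, which effectively creates infinite recursion: B.f calls B.f again in the worker process. Each nested call starts another executor or pool, which likely explains both errors in the question: the nested ProcessPoolExecutor instances keep opening resources until file descriptors run out, and Pool workers are daemonic, so creating a pool inside one raises the AssertionError.

pickle.dumps uses the __reduce__ method of the bound method object. In my opinion, its implementation does not consider this scenario: it ignores the real function object (__func__) and only looks the method up again on the instance self (B()) by its simple name (f), which yields B.f. This is very likely a bug.

The good news is that, now that we know where the issue is, we can fix it by registering our own reduction function that recreates the bound method object from the original function (A.f) and the instance (B()):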
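The lookup-by-name behaviour described above can be observed without any worker process. The following is a minimal sketch, assuming CPython's reduction of bound methods to a `getattr` call; the `parent_method` helper and the return strings are made up for illustration, and `B.f` is deliberately non-recursive here so the snippet terminates.

```python
class A:
    def f(self):
        return "A.f"


class B(A):
    def f(self):
        return "B.f"

    def parent_method(self):
        # Hypothetical helper: the parent's f, bound to this B instance.
        return super().f


b = B()
method = b.parent_method()

# __reduce__ returns (callable, args); pickle stores this pair and
# calls callable(*args) when loading.
func, args = method.__reduce__()
rebuilt = func(*args)

print(func is getattr, args[1])   # the reduction is getattr(instance, name)
print(method.__func__ is A.f)     # the original method wraps A.f
print(rebuilt.__func__ is B.f)    # the rebuilt one resolves to B.f
```

Because only the instance and the name survive the reduction, the normal attribute lookup on a `B` instance finds the override, not the parent implementation.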

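import copyreg
import multiprocessing.reduction
import types

def my_reduce(obj):
    # Rebuild the bound method from the original function and the instance,
    # i.e. obj.__func__.__get__(obj.__self__), instead of getattr(self, name).
    return (obj.__func__.__get__, (obj.__self__,))

# Register for plain pickle and for multiprocessing's ForkingPickler.
copyreg.pickle(types.MethodType, my_reduce)
multiprocessing.reduction.register(types.MethodType, my_reduce)

This works because functions are descriptors: calling __get__ on the original function with the instance returns a bound method.

PS: I have filed a bug report.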
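To see why the `(callable, args)` pair returned by `my_reduce` rebuilds the right method, it can be applied by hand, which is roughly what unpickling does with it. This is only a sketch: the classes and return strings are illustrative, and nothing is actually pickled or sent to a worker here.

```python
import types


class A:
    def f(self):
        return "A.f"


class B(A):
    def f(self):
        return "B.f"


def my_reduce(obj):
    # Same reduction as in the answer: keep the real function and the instance.
    return (obj.__func__.__get__, (obj.__self__,))


b = B()
parent_method = super(B, b).f        # bound method A.f of a B instance

rebuild, args = my_reduce(parent_method)
restored = rebuild(*args)            # equivalent to A.f.__get__(b)

print(isinstance(restored, types.MethodType))
print(restored.__func__ is A.f, restored.__self__ is b)
print(restored())
```

The restored object is still bound to the `B` instance but wraps `A.f`, so calling it in a worker would run the parent implementation instead of re-entering `B.f`.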

answered Nov 15 '22 02:11

georgexsh
