I have a list of about 800000 elements (small strings) that are loaded into a Queue, which is then consumed by worker processes from a multiprocessing pool. I've found that in both PyPy (2.7) and CPython (3.6), even though I set the Queue's maxsize explicitly to 0, the Queue is capped at 32768 elements at any given time, and therefore blocks on the 32768th element.
Why is this happening? I thought a Queue was supposed to be infinite if maxsize is <= 0? I've gone over the StackOverflow question "Python Queue raising Full even when infinite", but it's the only one of its kind. Is there something else I may be missing?
I've tried the multiprocessing Queue implementation, loading a million integers, and the queue.put(val) call always blocks on the 32768th value:
from multiprocessing import Queue

q = Queue(maxsize=0)  # maxsize <= 0 should mean "infinite"
for i in range(int(1e6)):
    q.put(i)  # blocks once 32768 items are enqueued
    print(i)
I was expecting to be able to insert all 1 million integers into the Queue, but it cannot hold them all: it blocks on the 32768th integer. I'd love to have some light shed on why this may be happening. It may already have been answered in the StackOverflow question linked above, but the answerer there asked whether a 32-bit Python distribution was in use, which is not my case: I'm using a 64-bit Python distribution in both cases, as can be seen here (for PyPy 2.7.13, which is what I'm using in my project):
Python 2.7.13 (990cef41fe11, Mar 21 2019, 12:15:10)
[PyPy 7.1.0 with GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.11.45.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>> import sys;print("%x" % sys.maxsize, sys.maxsize > 2**32)
('7fffffffffffffff', True)
UPDATE:
I've noticed something very interesting. The blocking happened while running the Queue on macOS, but when I ran the same code in a Docker container with Linux, the queue was effectively loaded in full, with all 800000 elements at once! It seems this has something to do with macOS.
A multiprocessing Queue hanging like this means you're most likely exceeding the maximum semaphore value; i.e. OSX can't count that high and implicitly restricts you to 2^15 - 1 = 32767.
I'm not sure how reliable/up-to-date the source is, but this is in line with what appears to be the maximum value on OSX (https://github.com/st3fan/osx-10.9/blob/master/xnu-2422.1.72/bsd/sys/semaphore.h).
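As a quick cross-check, here's a minimal sketch (assuming the platform exposes SC_SEM_VALUE_MAX through sysconf, which may not be the case everywhere):

import os

# Ask the OS for its POSIX semaphore value limit, if it exposes one.
# On macOS this should report 32767; typical Linux systems report a
# far larger value (often 2147483647).
if "SC_SEM_VALUE_MAX" in os.sysconf_names:
    print(os.sysconf("SC_SEM_VALUE_MAX"))
else:
    print("SC_SEM_VALUE_MAX is not exposed via sysconf on this platform")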
Edit/proof:
Explicitly trying to use a limit greater than 2^15 - 1 fails:
>>> Queue(maxsize=2**15)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/context.py", line 102, in Queue
return Queue(maxsize, ctx=self.get_context())
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/queues.py", line 48, in __init__
self._sem = ctx.BoundedSemaphore(maxsize)
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/context.py", line 87, in BoundedSemaphore
return BoundedSemaphore(value, ctx=self.get_context())
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/synchronize.py", line 145, in __init__
SemLock.__init__(self, SEMAPHORE, value, value, ctx=ctx)
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/synchronize.py", line 59, in __init__
unlink_now)
OSError: [Errno 22] Invalid argument
In your case, you tried to implicitly create a queue of infinite size (maxsize <= 0), but that's not what actually happens: when maxsize <= 0, CPython substitutes the platform's SEM_VALUE_MAX, so the OSX limit is applied: https://github.com/python/cpython/blob/master/Lib/multiprocessing/queues.py#L37
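You can inspect the effective ceiling directly. A minimal sketch, assuming SEM_VALUE_MAX is still exposed by multiprocessing.synchronize (it's an internal constant and could move in a future version):

# The value multiprocessing substitutes when maxsize <= 0.
# Expected output: 32767 on macOS, much larger on Linux.
from multiprocessing.synchronize import SEM_VALUE_MAX
print(SEM_VALUE_MAX)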
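If you really do need to stage all 800000 items up front, one possible workaround (my own sketch, not part of the limit discussion above) is a manager-backed queue: it lives in a separate server process on top of queue.Queue, so no POSIX semaphore ceiling applies, at the cost of proxying every operation over IPC:

from multiprocessing import Manager

manager = Manager()
q = manager.Queue()       # backed by queue.Queue in the manager process
for i in range(40000):    # comfortably past the 32767 ceiling
    q.put(i)
print(q.qsize())          # 40000
manager.shutdown()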