Python Multiprocessing Queue when set to infinite is capped at 32768 (2^15)

I have a list with about 800000 elements (small strings) that are loaded into a Queue which is then consumed by different worker processes from a multiprocessing pool. I've found out that both in PyPy and in Python (2.7 and 3.6 respectively), even though I've set the Queue's maxsize explicitly to 0, the Queue in both cases is capped at 32768 elements at any given time, and therefore blocks on the 32768th element.

Why is this happening? I thought a Queue was supposed to be infinite if maxsize is <= 0. I've gone over the StackOverflow question "Python Queue raising Full even when infinite", but it's the only one of its kind. Is there something else I may be missing?

I've tried a minimal multiprocessing Queue example where I load a million integers, and the queue.put(val) call always blocks on the 32768th value.

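from multiprocessing import Queue

q = Queue(maxsize=0)  # maxsize <= 0 is supposed to mean "infinite"
for i in range(1000000):  # one million integers
    q.put(i)  # blocks on the 32768th value on macOS
    print(i)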

I was expecting to be able to insert all 1 million integers into the Queue, but it cannot hold them all: put blocks on the 32768th integer. I'd love some light shed on why this happens. It may already have been answered in the StackOverflow question linked above, but the answerer there asked whether a 32-bit Python distribution was in use, which is not my case. I am using a 64-bit Python distribution in both cases, as can be seen here (for PyPy with 2.7.13, which is what I'm using in my project):

Python 2.7.13 (990cef41fe11, Mar 21 2019, 12:15:10)
[PyPy 7.1.0 with GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.11.45.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>> import sys;print("%x" % sys.maxsize, sys.maxsize > 2**32)
('7fffffffffffffff', True)

UPDATE:

I've noticed something very interesting. This happened while running the Queue on MacOS, but I ran the code in a docker container with Linux and the queue was effectively loaded in full with all 800000 elements at once! It seems that this has something to do with MacOS.
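
The difference between the two platforms can be probed without hanging the script. The following is a minimal sketch, assuming nothing ever reads from the queue. `count_accepted` is a hypothetical helper (not part of the original code) that uses `put_nowait`, so a full queue raises `queue.Full` instead of blocking.

```python
import queue
from multiprocessing import Queue


def count_accepted(n_items, maxsize=0):
    """Hypothetical helper: count how many puts succeed before Full is raised."""
    q = Queue(maxsize=maxsize)
    # Nothing reads this queue, so don't wait for its feeder thread at exit.
    q.cancel_join_thread()
    accepted = 0
    for i in range(n_items):
        try:
            q.put_nowait(i)  # raises queue.Full instead of blocking
        except queue.Full:
            break
        accepted += 1
    return accepted


if __name__ == "__main__":
    # 40000 is just above the 2^15 cap reported in the question.
    print(count_accepted(40000))
```

On the macOS setup described above this would presumably stop short of 40000, while on Linux all items should be accepted.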

Juan Amari asked May 30 '19 12:05

1 Answer

A multiprocessing Queue that hangs like this has most likely hit the maximum semaphore value. The queue bounds its size with a semaphore, and on OSX a semaphore cannot count higher than 2^15 - 1 (32767), so that becomes the implicit limit.

I'm not sure how reliable or up-to-date this source is, but the number matches what appears to be the maximum semaphore value on OSX: https://github.com/st3fan/osx-10.9/blob/master/xnu-2422.1.72/bsd/sys/semaphore.h

Edit/proof:

Explicitly trying to use a greater limit than 2^15 - 1 fails:

>>> Queue(maxsize=2**15)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/context.py", line 102, in Queue
    return Queue(maxsize, ctx=self.get_context())
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/queues.py", line 48, in __init__
    self._sem = ctx.BoundedSemaphore(maxsize)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/context.py", line 87, in BoundedSemaphore
    return BoundedSemaphore(value, ctx=self.get_context())
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/synchronize.py", line 145, in __init__
    SemLock.__init__(self, SEMAPHORE, value, value, ctx=ctx)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/synchronize.py", line 59, in __init__
    unlink_now)
OSError: [Errno 22] Invalid argument

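In your case you asked for a queue of infinite size (maxsize <= 0), but that is not what you actually get. For maxsize <= 0 the Queue constructor substitutes the platform's maximum semaphore value (SEM_VALUE_MAX), so the OSX limit applies: https://github.com/python/cpython/blob/master/Lib/multiprocessing/queues.py#L37. That would also explain why the same code runs to completion on Linux, where that maximum is much larger.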
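
To see which limit applies on a given machine, the constant can be printed directly. This is a small sketch, assuming CPython's `multiprocessing.synchronize` module is importable. `effective_maxsize` is a hypothetical helper that only mirrors the substitution described above; it is not a function from the standard library.

```python
try:
    # Platform maximum for a semaphore's counter, as exposed by CPython.
    from multiprocessing.synchronize import SEM_VALUE_MAX
except ImportError:
    # Platforms without a working semaphore implementation cannot
    # create a multiprocessing Queue at all.
    SEM_VALUE_MAX = None


def effective_maxsize(maxsize):
    """Hypothetical helper: the capacity a Queue(maxsize) would end up with."""
    # Mirrors the constructor: a non-positive maxsize becomes SEM_VALUE_MAX.
    return SEM_VALUE_MAX if maxsize <= 0 else maxsize


if __name__ == "__main__":
    print("SEM_VALUE_MAX:", SEM_VALUE_MAX)
    print("Queue(maxsize=0) capacity:", effective_maxsize(0))
```

If the first line prints 32767, an "infinite" queue on that machine holds at most that many items before put blocks.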

leongold answered Oct 16 '22 16:10
