I'm running python 2.7.3 and I noticed the following strange behavior. Consider this minimal example:
    from multiprocessing import Process, Queue

    def foo(qin, qout):
        while True:
            bar = qin.get()
            if bar is None:
                break
            qout.put({'bar': bar})

    if __name__ == '__main__':
        import sys

        qin = Queue()
        qout = Queue()
        worker = Process(target=foo, args=(qin, qout))
        worker.start()

        for i in range(100000):
            print i
            sys.stdout.flush()
            qin.put(i**2)

        qin.put(None)
        worker.join()
When I loop over 10,000 or more, my script hangs on worker.join(). It works fine when the loop only goes to 1,000.
Any ideas?
The qout queue in the subprocess gets full. The data you put into it from foo() doesn't fit in the buffer of the OS pipe used internally, so the subprocess blocks trying to fit more data. But the parent process is not reading this data: it is simply blocked too, waiting for the subprocess to finish. This is a typical deadlock.
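A straightforward way out, assuming the parent actually wants the 100,000 results back, is to drain qout before calling worker.join(). Here is a minimal sketch of that fix in the Python 2 style of the question (the count and results names are just for illustration):

    from multiprocessing import Process, Queue

    def foo(qin, qout):
        while True:
            bar = qin.get()
            if bar is None:   # sentinel tells the worker to stop
                break
            qout.put({'bar': bar})

    if __name__ == '__main__':
        qin = Queue()
        qout = Queue()
        worker = Process(target=foo, args=(qin, qout))
        worker.start()

        count = 100000
        for i in range(count):
            qin.put(i**2)
        qin.put(None)

        # Drain qout BEFORE joining: this empties the pipe the worker is
        # blocked on, so it can finish writing and terminate.
        results = [qout.get() for _ in range(count)]

        worker.join()
        print 'Collected', len(results), 'results'

Because foo() puts exactly one item on qout for each item it takes off qin, reading count results back is enough to let the worker flush its pipe and exit, at which point join() returns.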
There must be a limit on the size of queues. Consider the following modification:
    from multiprocessing import Process, Queue

    def foo(qin, qout):
        while True:
            bar = qin.get()
            if bar is None:
                break
            #qout.put({'bar': bar})

    if __name__ == '__main__':
        import sys

        qin = Queue()
        qout = Queue()    ## POSITION 1

        for i in range(100):
            #qout = Queue()    ## POSITION 2
            worker = Process(target=foo, args=(qin, qout))
            worker.start()
            for j in range(1000):
                x = i*100 + j
                print x
                sys.stdout.flush()
                qin.put(x**2)
            qin.put(None)
            worker.join()

        print 'Done!'
This works as-is (with the qout.put line commented out). If you try to save all 100,000 results, then qout becomes too large: if I uncomment the qout.put({'bar': bar}) in foo and leave the definition of qout at POSITION 1, the code hangs. If, however, I move the qout definition to POSITION 2, the script finishes.

So, in short, you have to be careful that neither qin nor qout becomes too large. (See also: Multiprocessing Queue maxsize limit is 32767.)