 

Multiprocessing in Python blocked

I'm using multiprocessing in my project. I have a worker function that puts its results into a queue. Everything works fine, but as the size of x increases (in my case x is an array) something goes wrong. Here is a simplified version of my code:

from multiprocessing import Process, Queue

def do_work(queue, x):
    result = heavy_computation_function(x)
    queue.put(result)   # PROBLEM HERE

def parallel_something():
    queue = Queue()
    procs = [Process(target=do_work, args=(queue, i)) for i in xrange(20)]
    for p in procs: p.start()
    for p in procs: p.join()

    results = []
    while not queue.empty():
        results.append(queue.get())

    return results

I can see the Python processes working in the system monitor, but then something happens and all the processes are running while doing nothing. This is what I get when typing Ctrl-D:

    pid, sts = os.waitpid(self.pid, flag)
KeyboardInterrupt

I did some tests, and the problem seems to be in putting the results into the queue: if I don't put the results, everything works, but then there would be no purpose.

asked Nov 30 '12 by blueSurfer


1 Answer

You are most probably generating a deadlock.

From the programming guidelines:

This means that whenever you use a queue you need to make sure that all items which have been put on the queue will eventually be removed before the process is joined. Otherwise you cannot be sure that processes which have put items on the queue will terminate. Remember also that non-daemonic processes will be joined automatically.

A possible fix is also proposed on that page. Keep in mind that if processes aren't joined, that doesn't mean they are "occupying" resources in any sense. This means you can get the queued data out after the processes have completed their work (perhaps using locks) and only join the processes afterwards.
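Following that guideline, a minimal sketch of the reordering could look like the code below. It drains the queue before joining; the computation (x * x) and the use of the loop index as input are placeholders standing in for heavy_computation_function and the real array, since those aren't shown in the question.

from multiprocessing import Process, Queue

def do_work(queue, x):
    result = x * x          # placeholder for the heavy computation
    queue.put(result)

def parallel_something():
    queue = Queue()
    procs = [Process(target=do_work, args=(queue, i)) for i in range(20)]
    for p in procs:
        p.start()

    # Each worker puts exactly one item, so one get() per process
    # drains the queue -- and this happens *before* joining.
    results = [queue.get() for _ in procs]

    for p in procs:
        p.join()

    return results

if __name__ == '__main__':
    print(parallel_something())

Since queue.get() blocks until an item is available, this collects exactly one result per worker, after which the join() calls return promptly instead of deadlocking.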

answered Sep 28 '22 by fabioedoardoluigialberto