
Multiprocessing Bomb

I was working through the following example from Doug Hellmann's tutorial on multiprocessing:

    import multiprocessing

    def worker():
        """worker function"""
        print 'Worker'
        return

    if __name__ == '__main__':
        jobs = []
        for i in range(5):
            p = multiprocessing.Process(target=worker)
            jobs.append(p)
            p.start()

When I tried to run it outside the if statement:

    import multiprocessing

    def worker():
        """worker function"""
        print 'Worker'
        return

    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=worker)
        jobs.append(p)
        p.start()

It started spawning processes non-stop, and the only way to stop it was to reboot!

Why would that happen? Why didn't it generate 5 processes and exit? Why do I need the if statement?

Ηλίας asked Apr 23 '10 09:04


2 Answers

On Windows there is no fork() routine, so multiprocessing imports the current module in each child process to get access to the worker function. Importing runs the module's top-level code again, so without the if statement every child starts its own children, and so on.
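The effect of that re-import can be imitated without starting any processes. The following is only a sketch in Python 3 syntax: the two source strings, the `started` list and the helper `starts_on_reimport` are made up for illustration, and it assumes the child loads the parent's file under a module name other than `'__main__'` (the name `'__mp_main__'` used here is an assumption about what the child uses). Instead of really starting processes, the fake script just records each would-be start.

```python
import os
import runpy
import tempfile

# Hypothetical stand-in for the guarded script from the question.
# It appends to `started` instead of really starting a process.
GUARDED = '''
started = []

def worker():
    pass

if __name__ == '__main__':
    for i in range(5):
        started.append(i)
'''

# The same script with the loop moved to module level (no guard).
UNGUARDED = '''
started = []

def worker():
    pass

for i in range(5):
    started.append(i)
'''


def starts_on_reimport(source):
    """Load `source` the way a child would and count the recorded starts."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "script.py")
        with open(path, "w") as f:
            f.write(source)
        # Run the file's top-level code under a name that is not '__main__',
        # which is the situation a non-forked child is in.
        namespace = runpy.run_path(path, run_name="__mp_main__")
    return len(namespace["started"])
```

With the guard, `starts_on_reimport(GUARDED)` records no starts, because the `if` block is skipped when the name is not `'__main__'`. Without it, `starts_on_reimport(UNGUARDED)` records five, and in the real script each of those would be a new process that repeats the same import.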

Denis Otkidach answered Sep 23 '22 04:09


Note that the documentation mentions that you need the if statement on Windows.

However, the documentation doesn't say that this kills your machine almost instantly, requiring a reboot. So this can be quite confusing, especially if the use of multiprocessing happens in some function deep inside the code. No matter how deeply hidden it is, you still need the if check in the main program file. This pretty much rules out using multiprocessing in any kind of library.
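As a rough sketch of that situation (Python 3 syntax; the helper name `run_workers` is hypothetical and not from the question), the process creation can sit inside a function, but the call that kicks it off still goes behind the check in the file that is run as the program:

```python
import multiprocessing


def worker():
    """Worker function, as in the question."""
    print("Worker")


def run_workers(count=5):
    """Hypothetical helper that hides the multiprocessing calls.

    Starts `count` worker processes, waits for them, and returns
    their exit codes.
    """
    jobs = [multiprocessing.Process(target=worker) for _ in range(count)]
    for p in jobs:
        p.start()
    for p in jobs:
        p.join()
    return [p.exitcode for p in jobs]


if __name__ == "__main__":
    # The guard belongs to the program's entry point, not to the helper:
    # a child that re-imports this file defines the functions above but
    # does not reach this call.
    print(run_workers())
```

If `run_workers` lived in a separate module, that module itself would need no guard, but every script calling it at top level would.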

multiprocessing in general seems a bit rough. Its interface may mirror that of the threading module, but there is just no simple way around the GIL.

For more complex parallelization problems I would also look at the subprocess module or some other libraries (like mpi4py or Parallel Python).

nikow answered Sep 25 '22 04:09