I was working through the following example from Doug Hellmann's tutorial on multiprocessing:

    import multiprocessing

    def worker():
        """worker function"""
        print 'Worker'
        return

    if __name__ == '__main__':
        jobs = []
        for i in range(5):
            p = multiprocessing.Process(target=worker)
            jobs.append(p)
            p.start()

When I tried to run it outside the if statement:

    import multiprocessing

    def worker():
        """worker function"""
        print 'Worker'
        return

    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=worker)
        jobs.append(p)
        p.start()

It started spawning processes non-stop, and the only way to stop it was to reboot!
Why would that happen? Why did it not generate 5 processes and exit? Why do I need the if statement?
On Windows there is no `fork()` routine, so `multiprocessing` imports the current module to get access to the `worker` function. That import re-runs every top-level statement, so without the `if` statement each child process starts its own children, and so on.
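
To make the re-import concrete, here is a minimal sketch that imitates it without starting any processes. It assumes Python 3, where a spawned child loads the main module under the name `__mp_main__` rather than `__main__`; `SOURCE` and `run_as` are hypothetical names made up for this illustration.

```python
# A stand-in for a script: one unguarded statement and one guarded statement.
SOURCE = '''
events.append("top-level code ran")
if __name__ == "__main__":
    events.append("guarded code ran")
'''


def run_as(module_name):
    """Execute SOURCE as if it were a module with the given __name__."""
    events = []
    namespace = {"__name__": module_name, "events": events}
    exec(compile(SOURCE, "<demo>", "exec"), namespace)
    return events


# The parent process runs the script as __main__.
print(run_as("__main__"))
# A spawned child re-imports it under a different name (assumed: __mp_main__).
print(run_as("__mp_main__"))
```

Only the guarded statement is skipped in the second run. Anything left outside the guard, such as a loop that calls `p.start()`, runs again in every child.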
Note that the documentation mentions that you need the `if` statement on Windows.
However, the documentation doesn't say that omitting it kills your machine almost instantly, requiring a reboot. So this can be quite confusing, especially if the use of `multiprocessing` happens in some function deep inside the code. No matter how deeply hidden it is, you still need the `if` check in the main program file. This pretty much rules out using `multiprocessing` in any kind of library.
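
As an illustration of where the check has to live, here is a minimal sketch in Python 3 syntax. `square` and `parallel_squares` are hypothetical names standing in for library code that hides its use of `multiprocessing`; the guard still sits in the script that gets launched.

```python
import multiprocessing


def square(x):
    # Defined at module level so child processes can look it up by name.
    return x * x


def parallel_squares(values, processes=2):
    """Hypothetical library helper that hides a Pool from its caller."""
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # The Pool is created inside parallel_squares(), but the guard is
    # needed here, in the main program file.
    print(parallel_squares([1, 2, 3]))
```

If the last two lines were moved outside the guard, a platform without `fork()` would hit the same re-import problem, even though the caller never touches `multiprocessing` directly.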
`multiprocessing` in general seems a bit rough. It might mimic the `threading` interface, but there is just no simple way around the GIL.
For more complex parallelization problems I would also look at the `subprocess` module or some other libraries (like mpi4py or Parallel Python).