import time
from multiprocessing import Process

def loop(limit):
    for i in xrange(limit):
        pass
    print i

limit = 100000000 #100 million
start = time.time()
for i in xrange(5):
    p = Process(target=loop, args=(limit,))
    p.start()
p.join()
end = time.time()
print end - start
I tried running this code, and this is the output I am getting:
99999999
99999999
2.73401999474
99999999
99999999
99999999
and sometimes
99999999
99999999
3.72434902191
99999999
99999999
99999999
99999999
99999999
In this case the loop function is called 7 times instead of 5. Why does this strange behaviour occur?
I am also confused about the role of the p.join() statement. Is it ending any one process or all of them at the same time?
As your code stands, p.join() waits only for the last process you started to finish before moving on to the next section of code. If you walk through what your code does, you should see why you get the "strange" output.
for i in xrange(5):
    p = Process(target=loop, args=(limit,))
    p.start()
This starts 5 new processes one after the other. They all run at the same time, or nearly so: it is up to the scheduler to decide which process gets CPU time at any given moment.
This means you now have 5 processes running:
Process 1
Process 2
Process 3
Process 4
Process 5
p.join()
This waits for the process that p currently refers to. That is Process 5, as it was the last process to be assigned to p.
Let's now say that Process 2 finishes first, followed by Process 5, which is perfectly feasible as the scheduler could give those processes more time on the CPU.
Process 1
Process 2 prints 99999999
Process 3
Process 4
Process 5 prints 99999999
The p.join() call now returns and the script moves on to the next part, as Process 5 (the process p refers to) has finished.
end = time.time()
print end - start
This section prints the elapsed time, and there are still 3 processes running after this output.
The other processes then finish and print their 99999999.
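The walk-through above can be reproduced with a small sketch. It is written for Python 3 (the question uses Python 2) and replaces the busy loop with time.sleep so the finishing order is predictable. The names nap and count_alive and the sleep durations are made up for illustration.

```python
import time
from multiprocessing import Process


def nap(seconds):
    """Stand-in for the question's busy loop: just sleep for a while."""
    time.sleep(seconds)
    return seconds


def count_alive(processes):
    """Return how many of the given processes are still running."""
    return sum(1 for proc in processes if proc.is_alive())


def main():
    # The last process gets the shortest nap, so it should finish first.
    durations = [1.0, 1.0, 1.0, 1.0, 0.2]
    processes = []
    for seconds in durations:
        p = Process(target=nap, args=(seconds,))
        p.start()
        processes.append(p)

    p.join()  # waits only for the fifth process, the last one bound to p
    print("still running after p.join():", count_alive(processes))

    for proc in processes:  # clean up: wait for the remaining processes
        proc.join()
    print("still running after joining all:", count_alive(processes))


if __name__ == "__main__":
    main()
```

On a typical run the first count should be greater than zero, because the four longer naps are still in progress when p.join() returns. The second count is zero, since every process has been joined by then.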
To fix this behaviour you will need to .join() all the processes. To do this you could alter your code to this...
processes = []
for i in xrange(5):
    p = Process(target=loop, args=(limit,))
    p.start()
    processes.append(p)
for process in processes:
    process.join()
This will wait for the first process, then the second and so on. It won't matter if one process finishes before another, because every process in the list must be waited on before the script continues.
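To see why the join order does not matter, here is a minimal sketch in Python 3 (not the question's Python 2). It times each join() call and reads each process's exitcode afterwards. The helper names nap and join_all and the sleep durations are assumptions chosen for the example.

```python
import time
from multiprocessing import Process


def nap(seconds):
    """Stand-in for real work: sleep for the given number of seconds."""
    time.sleep(seconds)


def join_all(processes):
    """Join every process in list order; return (seconds waited, exit code) pairs."""
    results = []
    for proc in processes:
        before = time.time()
        proc.join()  # returns at once if proc has already finished
        results.append((time.time() - before, proc.exitcode))
    return results


def main():
    # The first process is the slowest, so the first join does most of the waiting.
    processes = [Process(target=nap, args=(s,)) for s in (0.6, 0.1, 0.3)]
    for proc in processes:
        proc.start()
    for waited, exitcode in join_all(processes):
        print("waited %.2fs, exit code %s" % (waited, exitcode))


if __name__ == "__main__":
    main()
```

The first join() should account for most of the waiting. The later calls return almost immediately because those processes have already exited by the time they are joined.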
There are some problems with the way you are doing things. Try this:
start = time.time()
procs = []
for i in xrange(5):
    p = Process(target=loop, args=(limit,))
    p.start()
    procs.append(p)
# wait for every process, not just the last one started
for p in procs:
    p.join()
The problem is that you are not keeping track of the individual processes (the p variables inside the loop). You need to keep them around so you can interact with them. This update keeps them in a list and then joins all of them at the end.
Output looks like this:
99999999
99999999
99999999
99999999
99999999
6.29328012466
Note that the elapsed time is now printed last, after every process has finished.
Also, I ran your code and was not able to get the loop to execute multiple times.
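As a rough illustration of the timing point, the sketch below takes the end timestamp only after every process has been joined. It is written for Python 3 and uses time.sleep instead of the question's CPU-bound loop. The names nap and run_and_time are hypothetical.

```python
import time
from multiprocessing import Process


def nap(seconds):
    """Stand-in for real work: sleep for the given number of seconds."""
    time.sleep(seconds)


def run_and_time(durations):
    """Start one process per duration, join them all, return elapsed seconds."""
    start = time.time()
    procs = [Process(target=nap, args=(s,)) for s in durations]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return time.time() - start  # taken only after every child has exited


if __name__ == "__main__":
    elapsed = run_and_time([0.5] * 5)
    print("five 0.5s naps took %.2fs in total" % elapsed)
```

Because the naps overlap, the measured time should be much closer to the longest nap than to the sum of all five, plus some process start-up overhead. With CPU-bound work such as the original loop, the result also depends on how many cores are available.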