Based on this question, I assumed that creating a new process should be almost as fast as creating a new thread on Linux. However, a small test showed a very different result. Here's my code:
from multiprocessing import Process, Pool
from threading import Thread

times = 1000

def inc(a):
    b = 1
    return a + b

def processes():
    for i in xrange(times):
        p = Process(target=inc, args=(i, ))
        p.start()
        p.join()

def threads():
    for i in xrange(times):
        t = Thread(target=inc, args=(i, ))
        t.start()
        t.join()
Tests:
>>> timeit processes()
1 loops, best of 3: 3.8 s per loop
>>> timeit threads()
10 loops, best of 3: 98.6 ms per loop
So, processes are almost 40 times slower to create! Why does this happen? Is it specific to Python or to these libraries? Or did I just misinterpret the answer above?
UPD 1. To make it clearer: I understand that this piece of code doesn't actually introduce any concurrency. The goal here is to test the time needed to create a process and a thread. To get real concurrency with Python, one can use something like this:
def pools():
    pool = Pool(10)
    pool.map(inc, xrange(times))
which really does run much faster than the threaded version.
UPD 2. I have added a version with os.fork():
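import os

times = 1000  # same iteration count as in the code above

for i in xrange(times):
    child_pid = os.fork()
    if child_pid:
        # Parent: wait for the child to finish before forking again.
        os.waitpid(child_pid, 0)
    else:
        # Child: exit straight away.
        exit(-1)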
Results are:
$ time python test_fork.py
real 0m3.919s
user 0m0.040s
sys 0m0.208s
$ time python test_multiprocessing.py
real 0m1.088s
user 0m0.128s
sys 0m0.292s
$ time python test_threadings.py
real 0m0.134s
user 0m0.112s
sys 0m0.048s
The question you linked to is comparing the cost of just calling fork(2) vs. pthread_create(3), whereas your code does quite a bit more, e.g. using join() to wait for the processes/threads to terminate.
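To see how much of the measured time goes into join() rather than into creation, the two phases can be timed separately. The following is only a rough sketch in Python 3, not the code from the question: the helper names time_start_and_join and measure are made up for illustration, and it assumes the "fork" start method is available (Linux and other POSIX systems).

```python
import multiprocessing as mp
import threading
import time


def inc(a):
    return a + 1


def time_start_and_join(make_worker, count):
    """Return (seconds spent in start(), seconds spent in join())."""
    start_total = 0.0
    join_total = 0.0
    for i in range(count):
        worker = make_worker(i)
        t0 = time.perf_counter()
        worker.start()
        t1 = time.perf_counter()
        worker.join()
        t2 = time.perf_counter()
        start_total += t1 - t0
        join_total += t2 - t1
    return start_total, join_total


def measure(count=200):
    # Assumption: the "fork" start method exists on this platform (POSIX only).
    ctx = mp.get_context("fork")

    def make_thread(i):
        return threading.Thread(target=inc, args=(i,))

    def make_process(i):
        return ctx.Process(target=inc, args=(i,))

    return {
        "thread": time_start_and_join(make_thread, count),
        "process": time_start_and_join(make_process, count),
    }


if __name__ == "__main__":
    for kind, (start_s, join_s) in measure().items():
        print("%-8s start: %.4fs  join: %.4fs" % (kind, start_s, join_s))
```

The absolute numbers will depend on the machine and the Python version, so treat the output as a way to compare the two columns rather than as a benchmark.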
If, as you say...
The goal here is to test the time needed to create a process and a thread.
...then you shouldn't be waiting for them to complete. You should be using test programs more like these...
fork.py
import os
import time

def main():
    for i in range(100):
        pid = os.fork()
        if pid:
            #print 'created new process %d' % pid
            continue
        else:
            time.sleep(1)
            return

if __name__ == '__main__':
    main()
thread.py
import thread
import time

def dummy():
    time.sleep(1)

def main():
    for i in range(100):
        tid = thread.start_new_thread(dummy, ())
        #print 'created new thread %d' % tid

if __name__ == '__main__':
    main()
...which give the following results...
$ time python fork.py
real 0m0.035s
user 0m0.008s
sys 0m0.024s
$ time python thread.py
real 0m0.032s
user 0m0.012s
sys 0m0.024s
...so there's not much difference in the creation time of threads and processes.
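Note that `time python ...` also counts interpreter startup, and the two programs above are written for Python 2. If you want to time only the creation calls from inside the process, something along these lines might work on Python 3. This is a sketch under assumptions: the function names are hypothetical, the 0.2 second sleep is an arbitrary value chosen so that workers outlive the creation loop, and os.fork() is only available on POSIX systems.

```python
import os
import threading
import time

WORKER_SLEEP = 0.2  # arbitrary: long enough for workers to outlive the loop


def time_fork_creation(count):
    """Time only the os.fork() calls; children are reaped afterwards."""
    pids = []
    t0 = time.perf_counter()
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            # Child: idle briefly, then exit at once without running
            # the cleanup handlers inherited from the parent.
            time.sleep(WORKER_SLEEP)
            os._exit(0)
        pids.append(pid)
    elapsed = time.perf_counter() - t0
    for pid in pids:
        os.waitpid(pid, 0)  # reap outside the timed region
    return elapsed


def time_thread_creation(count):
    """Time only thread construction and start(); joins happen afterwards."""
    started = []
    t0 = time.perf_counter()
    for _ in range(count):
        t = threading.Thread(target=time.sleep, args=(WORKER_SLEEP,))
        t.start()
        started.append(t)
    elapsed = time.perf_counter() - t0
    for t in started:
        t.join()  # wait outside the timed region
    return elapsed


if __name__ == "__main__":
    # Fork first, so that no extra threads exist in the parent while forking.
    print("fork:   %.4fs" % time_fork_creation(100))
    print("thread: %.4fs" % time_thread_creation(100))
```

Waiting for the workers still happens, but only after the clock has stopped, so it does not affect the reported creation time.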