
How to achieve true parallelism with threads in Python?

I'm learning about the threading library in Python. I don't understand how to run two threads in parallel.

Here are my Python programs:

Program without threading (fibsimple.py)

def fib(n):
    if n < 2:
        return n
    else: 
        return fib(n-1) + fib(n-2)

fib(35)
fib(35)

print("Done")

Running time:

$ time python fibsimple.py 
Done

real    0m7.935s
user    0m7.922s
sys     0m0.008s

Same program with threading (fibthread.py)

from threading import Thread
def fib(n):
    if n < 2:
        return n
    else: 
        return fib(n-1) + fib(n-2)

t1 = Thread(target = fib, args = (35, ))
t1.start()

t2 = Thread(target = fib, args = (35, ))
t2.start()

t1.join()
t2.join()

print("Done")

Running time:

$ time python fibthread.py 
Done

real    0m12.313s
user    0m10.894s
sys     0m5.043s

I don't understand why the threaded program takes more time. It should take almost half the time if the threads were running in parallel.

But if I implement the same program with the multiprocessing library, the running time is roughly halved.

Program with multiprocessing (fibmultiprocess.py)

from multiprocessing import Process

def fib(n):
    if n < 2:
        return n
    else: 
        return fib(n-1) + fib(n-2)

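# The guard keeps child processes from re-running this block on
# platforms that start workers by spawning instead of forking.
if __name__ == "__main__":
    p1 = Process(target = fib, args = (35, ))
    p1.start()

    p2 = Process(target = fib, args = (35, ))
    p2.start()

    p1.join()
    p2.join()

    print("Done")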

Running time:

$ time python fibmultiprocess.py 
Done

real    0m4.303s
user    0m8.065s
sys     0m0.007s

Can someone explain how to run threads in parallel, and how multiprocessing and thread parallelism differ? Any help would be appreciated.

rishi kant asked Jul 08 '17 06:07


1 Answer

To explain the odd running time of the multithreaded version, you need to know about the GIL.

GIL stands for Global Interpreter Lock. In CPython it serializes access to the interpreter's internals across threads, so only ONE thread executes Python bytecode in an interpreter at any given time. On multi-core systems this means that CPU-bound threads, like your fib calls, cannot make effective use of more than one core.
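
As a rough illustration of that point, here is a minimal sketch that times the same CPU-bound work sequentially and in two threads. It assumes Python 3 (for time.perf_counter) and a standard CPython build with the GIL; the helper names run_sequential, run_threaded and timed are made up for this example. On such a build the threaded timing would be expected to be no better than the sequential one, but the exact numbers depend on the machine and the Python version.

```python
import time
from threading import Thread


def fib(n):
    # Deliberately slow, CPU-bound recursion (same idea as in the question).
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def run_sequential(n, repeats=2):
    # Call fib(n) one after another on the main thread.
    return [fib(n) for _ in range(repeats)]


def run_threaded(n, repeats=2):
    # Call fib(n) once per thread and collect the results in a shared list.
    results = []
    threads = [Thread(target=lambda: results.append(fib(n)))
               for _ in range(repeats)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def timed(func, *args):
    # Return (result, elapsed wall-clock seconds) for a single call.
    start = time.perf_counter()
    value = func(*args)
    return value, time.perf_counter() - start


if __name__ == "__main__":
    for func in (run_sequential, run_threaded):
        _, elapsed = timed(func, 27)
        print("%s: %.2fs" % (func.__name__, elapsed))
```

Both helpers compute the same values; only the elapsed time differs, which is what makes the comparison meaningful.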

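But why is the running time longer than without multithreading?

That's because extra time is spent switching between threads. The two threads keep contending for the GIL, and every handoff adds overhead on top of the actual computation. The jump in sys time from 0.008s to 5.043s in your measurements is consistent with that.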
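
How often such a handoff is requested is visible from Python itself. The sketch below assumes Python 3.2 or later, where sys.getswitchinterval() reports the time slice in seconds; Python 2, which the question uses, exposes a tick-based sys.getcheckinterval() instead. The helper name switch_interval_ms is made up for this example.

```python
import sys


def switch_interval_ms():
    # Time slice, in milliseconds, after which CPython asks the running
    # thread to give up the GIL when another thread is waiting for it.
    return sys.getswitchinterval() * 1000.0


if __name__ == "__main__":
    print("switch interval: %.1f ms" % switch_interval_ms())
```

Each of those handoffs costs some time, and for purely CPU-bound threads that time buys nothing because the work still runs one thread at a time.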

And of course, since multiprocessing starts a separate interpreter process for each worker, each with its own GIL, it is not affected. That's why the running time is roughly halved, as expected.
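
To make the "separate interpreters" part concrete, here is a small sketch in which each worker reports its own process ID along with its result. It assumes Python 3 on a POSIX system where the "fork" start method is available (with other start methods the call would need to sit under an if __name__ == "__main__": guard); the names _worker and fib_in_processes are made up for this example.

```python
import multiprocessing as mp
import os


def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def _worker(n, queue):
    # Runs in the child process: report the input, the result and the PID.
    queue.put((n, fib(n), os.getpid()))


def fib_in_processes(ns):
    # Assumption: the "fork" start method exists (Linux and most POSIX systems).
    ctx = mp.get_context("fork")
    queue = ctx.Queue()
    procs = [ctx.Process(target=_worker, args=(n, queue)) for n in ns]
    for p in procs:
        p.start()
    # Read every report before joining so no child blocks on a full pipe.
    reports = [queue.get() for _ in procs]
    for p in procs:
        p.join()
    return sorted(reports)


if __name__ == "__main__":
    print("parent pid:", os.getpid())
    for n, value, pid in fib_in_processes([20, 21]):
        print("fib(%d) = %d computed in pid %d" % (n, value, pid))
```

The PIDs differ from the parent's, which shows that the work ran in other processes, each with its own interpreter state and its own GIL.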

Reference

A good comparison between multithreading and multiprocessing in Python: link

To learn more about the GIL and see some other experiments, check out Understanding the Python GIL by David Beazley. It is the best explanation you can find.

YLJ answered Nov 07 '22 13:11