
A multi-threading example of the Python GIL

I've read quite a bit about how "bad" this Python GIL business is when writing multi-threaded code, but I've never seen an example. Could someone please give me a basic example of when the GIL causes problems when using threading?

Thanks!

vgoklani asked Jul 18 '17 04:07


1 Answer

One of the main reasons for multithreading is to let a program take advantage of multiple CPUs (and/or multiple cores on a CPU) in order to compute more operations per second. But in Python (specifically CPython, the standard interpreter), the GIL means that even if you have multiple threads working on a computation, only one of those threads will actually be executing Python code at any given instant, because all of the others will be blocked, waiting to acquire the global interpreter lock. That means that a multithreaded version of a CPU-bound Python program will typically be slower than the single-threaded version, rather than faster: only one thread runs at a time, and on top of that there is the accounting overhead incurred by forcing every thread to wait for, acquire, and then relinquish the GIL (round-robin style) every few milliseconds.

To demonstrate this, here is a toy Python script that spawns a specified number of threads, and then as its "computation" each thread just continually increments a counter until 5 seconds have passed. At the end, the main thread tallies up the total number of counter-increments that occurred and prints the total, to give us a measurement of how much "work" was done during the 5-second period.

import threading
import sys
import time

numSecondsToRun = 5

class CounterThread(threading.Thread):
   def __init__(self):
      threading.Thread.__init__(self)
      self._counter = 0
      self._endTime = time.time() + numSecondsToRun

   def run(self):
      # Simulate a computation on the CPU
      while(time.time() < self._endTime):
         self._counter += 1

if __name__ == "__main__":
   if len(sys.argv) < 2:
      print("Usage:  python counter.py 5")
      sys.exit(5)

   numThreads = int(sys.argv[1])
   print("Spawning %i counting threads for %i seconds..." % (numThreads, numSecondsToRun))

   # Start all of the worker threads
   threads = []
   for i in range(numThreads):
      t = CounterThread()
      t.start()
      threads.append(t)

   # Wait for each thread to finish, then add up the work they did
   totalCounted = 0
   for t in threads:
      t.join()
      totalCounted += t._counter
   print("Total amount counted was %i" % totalCounted)

Here are the results I get on my computer (which is a dual-core Mac Mini with hyper-threading enabled, FWIW):

$ python counter.py 1
Spawning 1 counting threads for 5 seconds...
Total amount counted was 14210740

$ python counter.py 2
Spawning 2 counting threads for 5 seconds...
Total amount counted was 10398956

$ python counter.py 3
Spawning 3 counting threads for 5 seconds...
Total amount counted was 10588091

$ python counter.py 4
Spawning 4 counting threads for 5 seconds...
Total amount counted was 11091197

$ python counter.py 5
Spawning 5 counting threads for 5 seconds...
Total amount counted was 11130036

$ python counter.py 6
Spawning 6 counting threads for 5 seconds...
Total amount counted was 10771654

$ python counter.py 7
Spawning 7 counting threads for 5 seconds...
Total amount counted was 10464226

Note how the best performance was achieved by the first run (where only a single worker thread was spawned); the counting productivity dropped considerably when more than one thread was running at once. This shows how the multithreaded performance of CPU-bound code in Python is crippled by the GIL: the same program written in C (or any other language without a GIL) would show much better performance with more threads running, not worse (up until the number of worker threads matched the number of cores on the hardware, of course).
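To put a number on "dropped considerably", here is a small sketch that only does arithmetic on the totals listed above (nothing is re-measured). It assumes Python 3, and the variable names are just for illustration:

```python
# Totals copied from the runs listed above, keyed by number of threads.
totals = {
    1: 14210740,
    2: 10398956,
    3: 10588091,
    4: 11091197,
    5: 11130036,
    6: 10771654,
    7: 10464226,
}

# Express each total as a fraction of the single-thread result.
baseline = totals[1]
relative = {n: total / baseline for n, total in totals.items()}

for n in sorted(relative):
    print("%i thread(s): %5.1f%% of the single-thread total" % (n, 100.0 * relative[n]))
```

For these particular runs, every multi-threaded total works out to somewhere between roughly 73% and 78% of the single-thread total.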

This doesn't mean that multithreading is completely useless in Python, though. It's still useful in cases where most or all of your threads are blocked waiting for I/O rather than CPU-bound. That's because a Python thread that is blocked waiting for I/O doesn't hold the GIL while it waits, so during that time other threads are still free to execute. If you need to parallelize a computation-intensive task, though (e.g. ray tracing, computing digits of pi, codebreaking, or similar), then you'll want to either use multiple processes rather than multiple threads, or use a different language that doesn't have a GIL.
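Both halves of that advice can be sketched in a few lines. This is a minimal illustration assuming Python 3, not a benchmark: time.sleep() stands in for a blocking I/O call (it releases the GIL while it waits), and the function names (wait_for_io, run_io_threads, count_for, count_in_processes) are made up for this example.

```python
import threading
import time
from concurrent.futures import ProcessPoolExecutor


def wait_for_io(seconds):
    # Stand-in for a blocking I/O call; the GIL is released while sleeping.
    time.sleep(seconds)


def run_io_threads(num_threads, seconds):
    """Run num_threads 'I/O-bound' threads and return the elapsed wall time."""
    start = time.monotonic()
    threads = [threading.Thread(target=wait_for_io, args=(seconds,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start


def count_for(seconds):
    """CPU-bound work: increment a counter until the time is up."""
    end_time = time.monotonic() + seconds
    counter = 0
    while time.monotonic() < end_time:
        counter += 1
    return counter


def count_in_processes(num_processes, seconds):
    """Run count_for in separate processes, each with its own interpreter and GIL."""
    with ProcessPoolExecutor(max_workers=num_processes) as pool:
        return sum(pool.map(count_for, [seconds] * num_processes))


if __name__ == "__main__":
    # The waits overlap, so this should take about 1 second rather than 4.
    print("4 threads sleeping 1 s each took %.2f s" % run_io_threads(4, 1.0))
    # Each process counts independently of the others.
    print("2 processes counted %i in 1 s" % count_in_processes(2, 1.0))
```

On a multi-core machine I'd expect the process-based total to grow with the number of processes, up to the number of cores, but that depends on the hardware, so no figures are given here.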

Jeremy Friesner answered Oct 19 '22 17:10