
Are locks unnecessary in multi-threaded Python code because of the GIL?

If you are relying on an implementation of Python that has a Global Interpreter Lock (e.g. CPython) and writing multithreaded code, do you really need locks at all?

If the GIL doesn't allow multiple instructions to be executed in parallel, wouldn't it be unnecessary to protect shared data?

Sorry if this is a dumb question, but it is something I have always wondered about Python on multi-processor/core machines.

The same thing would apply to any other language implementation that has a GIL.

Corey Goldberg asked Sep 19 '08 20:09



2 Answers

No - the GIL just protects Python's internals from multiple threads altering their state. This is a very low level of locking, sufficient only to keep Python's own structures in a consistent state. It doesn't cover the application-level locking you'll need to do to ensure thread safety in your own code.

The essence of locking is to ensure that a particular block of code is executed by only one thread at a time. The GIL enforces this for blocks the size of a single bytecode instruction, but usually you want the lock to span a larger block of code than this.
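To make the "single bytecode" granularity concrete, here is a small sketch using the standard `dis` module. The names `counter` and `increment` are made up for illustration, and the exact opcodes vary between CPython versions, so treat the printed list as indicative only.

```python
import dis

counter = 0  # hypothetical shared variable, for illustration only


def increment():
    global counter
    counter += 1  # one line of source, but several bytecode instructions


# Collect the opcode names that make up the function body.
opnames = [ins.opname for ins in dis.get_instructions(increment)]
print(opnames)
```

The list should show a separate load of the global, an add, and a store back. A thread switch between the load and the store is what the GIL alone does not rule out.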

Brian answered Sep 21 '22 05:09


You will still need locks if you share state between threads. The GIL only protects the interpreter's internal state. You can still have inconsistent updates in your own code.

For example:

    #!/usr/bin/env python3
    import threading

    shared_balance = 0

    class Deposit(threading.Thread):
        def run(self):
            global shared_balance
            for _ in range(1000000):
                # Unprotected read-modify-write on shared state.
                balance = shared_balance
                balance += 100
                shared_balance = balance

    class Withdraw(threading.Thread):
        def run(self):
            global shared_balance
            for _ in range(1000000):
                # Same unprotected pattern, in the other direction.
                balance = shared_balance
                balance -= 100
                shared_balance = balance

    threads = [Deposit(), Withdraw()]

    for thread in threads:
        thread.start()

    for thread in threads:
        thread.join()

    print(shared_balance)

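Here, your code can be interrupted between reading the shared state (balance = shared_balance) and writing the changed result back (shared_balance = balance), causing a lost update. The deposits and withdrawals should cancel out to 0, but a lost update leaves an unpredictable final value instead. How often this happens depends on the interpreter version and on thread scheduling, so some runs may still print 0.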

To make the updates consistent, the run methods would need to hold a lock around the read-modify-write sections (inside the loops), or have some way to detect when the shared state has changed since it was read.
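A minimal sketch of the locking approach, assuming a single `threading.Lock` shared by both threads. The names `balance_lock`, `adjust` and `run_demo` are not from the code above; they are made up for this example, and plain functions stand in for the two Thread subclasses to keep it short.

```python
import threading

shared_balance = 0
balance_lock = threading.Lock()  # hypothetical name; guards shared_balance


def adjust(amount, times):
    """Add `amount` to the shared balance `times` times, one locked step each."""
    global shared_balance
    for _ in range(times):
        # Holding the lock makes the read-modify-write below atomic with
        # respect to every other thread that takes the same lock.
        with balance_lock:
            balance = shared_balance
            balance += amount
            shared_balance = balance


def run_demo(times=100000):
    """Run one depositing and one withdrawing thread; return the final balance."""
    global shared_balance
    shared_balance = 0
    threads = [
        threading.Thread(target=adjust, args=(100, times)),
        threading.Thread(target=adjust, args=(-100, times)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return shared_balance


print(run_demo())  # prints 0
```

Taking the lock inside the loop keeps each critical section short, at the cost of one acquire and release per update. Taking it once around the whole loop would be cheaper but would serialize the two threads entirely.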

Will Harris answered Sep 20 '22 05:09