If you are relying on an implementation of Python that has a Global Interpreter Lock (i.e. CPython) and writing multithreaded code, do you really need locks at all?
If the GIL doesn't allow multiple instructions to be executed in parallel, wouldn't it be unnecessary to protect shared data?
Sorry if this is a dumb question, but it is something I have always wondered about Python on multi-processor/core machines.
The same thing would apply to any other language implementation that has a GIL.
CPython's reference-counting memory management is not thread-safe on its own, and the C extensions Python relies on required thread-safe memory management to prevent inconsistent changes; the GIL provided it. The GIL is simple to implement and was easily added to Python. It provides a performance increase to single-threaded programs, as only one lock needs to be managed.
At any moment, yes, only one thread is executing Python code (other threads may be executing some IO, NumPy, whatever). That is mostly true. However, this is trivially true on any single-processor system, and yet people still need locks on single-processor systems.
For CPU-bound programs, multithreading could in principle save a lot of time: with multiple CPU cores, each thread could run on a separate core. The GIL prevents this, however. Python threads cannot execute Python bytecode in parallel on multiple CPU cores because only the thread holding the global interpreter lock (GIL) can run.
Any time you are going to read or write data that is shared between threads, you need to lock it. This prevents a thread from reading data that isn't done being written yet. Another way to word this is that any data shared between threads or processes should be locked before it is altered or read.
No - the GIL just protects Python internals from multiple threads altering their state. This is a very low level of locking, sufficient only to keep Python's own structures in a consistent state. It doesn't cover the application-level locking you'll need to do to ensure thread safety in your own code.
The essence of locking is to ensure that a particular block of code is only executed by one thread. The GIL enforces this for blocks the size of a single bytecode, but usually you want the lock to span a larger block of code than this.
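To see why a single statement is larger than a single bytecode, here is a small sketch that disassembles an increment of a global. The names `counter` and `increment` are made up for illustration, and the exact opcode names differ between CPython versions, so treat the printed list as indicative only.

```python
import dis

counter = 0  # hypothetical shared variable, used only for illustration


def increment():
    global counter
    counter += 1  # one statement, but several bytecode instructions


# Collect the opcode names the function compiles to.
instructions = [ins.opname for ins in dis.get_instructions(increment)]
print(instructions)
```

The load of `counter` and the store back to it show up as separate instructions, with the addition in between. A thread switch between the load and the store is what makes the statement as a whole non-atomic.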
You will still need locks if you share state between threads. The GIL only protects the interpreter internally. You can still have inconsistent updates in your own code.
For example:
    #!/usr/bin/env python
    import threading

    shared_balance = 0

    class Deposit(threading.Thread):
        def run(self):
            global shared_balance
            for _ in range(1000000):
                # Unprotected read-modify-write of the shared state
                balance = shared_balance
                balance += 100
                shared_balance = balance

    class Withdraw(threading.Thread):
        def run(self):
            global shared_balance
            for _ in range(1000000):
                # Unprotected read-modify-write of the shared state
                balance = shared_balance
                balance -= 100
                shared_balance = balance

    threads = [Deposit(), Withdraw()]

    for thread in threads:
        thread.start()

    for thread in threads:
        thread.join()

    print(shared_balance)
Here, your code can be interrupted between reading the shared state (balance = shared_balance) and writing the changed result back (shared_balance = balance), causing a lost update. The result is an unpredictable value for the shared state.
To make the updates consistent, the run methods would need to lock the shared state around the read-modify-write sections (inside the loops) or have some way to detect when the shared state had changed since it was read.
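A minimal sketch of the locking approach, assuming a single `threading.Lock` guards the balance. The names `balance_lock`, `adjust` and `run_transfers` are made up for this illustration and are not part of the original example; the iteration count is reduced to keep the run short.

```python
import threading

shared_balance = 0
balance_lock = threading.Lock()  # hypothetical name: guards shared_balance


def adjust(amount, iterations):
    """Add `amount` to the shared balance `iterations` times, under the lock."""
    global shared_balance
    for _ in range(iterations):
        # Hold the lock for the whole read-modify-write section so no
        # other thread can slip in between the read and the write.
        with balance_lock:
            balance = shared_balance
            balance += amount
            shared_balance = balance


def run_transfers(iterations=100000):
    """Run one depositing and one withdrawing thread; return the final balance."""
    global shared_balance
    shared_balance = 0
    threads = [
        threading.Thread(target=adjust, args=(100, iterations)),
        threading.Thread(target=adjust, args=(-100, iterations)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return shared_balance


if __name__ == "__main__":
    print(run_transfers())  # prints 0: deposits and withdrawals cancel out
```

Because every read-modify-write happens while the lock is held, no update can be lost, and the deposits and withdrawals always cancel out. The cost is that the two threads take turns on every iteration.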