
Why do I have to worry about thread safety in CPython?

From what I understand, the Global Interpreter Lock allows only a single thread to access the interpreter and execute bytecode. If that's the case, then at any given time, only a single thread will be using the interpreter and its memory.

With that, I believe it is fair to exclude the possibility of race conditions, since no two threads can access the interpreter's memory at the same time. Yet I still see warnings about making sure data structures are "thread safe". Possibly those warnings are meant to cover all implementations of the Python interpreter (like Cython), which can switch off the GIL and allow true multithreading.

I understand the importance of thread safety in interpreter environments that do not have the GIL enabled. However, for CPython, why is thread safety encouraged when writing multithreaded Python code? What is the worst that can happen in the CPython environment?

Anfernee asked Aug 29 '16 12:08





1 Answer

Of course race conditions can still take place, because a sequence of operations on a data structure is not atomic.

Say you test for a key being present in a dictionary, then do something to add the key:

if key not in dictionary:
    # calculate new value
    value = elaborate_calculation()
    dictionary[key] = value

The thread can be switched at any point after the "not in" test has returned True, and another thread will then also come to the conclusion that the key isn't there. Now two threads are doing the calculation, and you don't know which one will win.
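To make that interleaving concrete, here is a minimal sketch that forces it to happen every time. The barrier, the `worker` function, the `calls` list and the thread names are hypothetical additions for illustration only. In real code the switch happens by chance, not through a barrier.

```python
import threading

dictionary = {}
calls = []  # names of the threads that actually ran the calculation

# Hypothetical helper: both threads must arrive here before either continues.
both_inside = threading.Barrier(2, timeout=5)


def elaborate_calculation():
    # Stand-in for the slow calculation. Waiting on the barrier forces the
    # unlucky schedule: neither thread can store its value until both have
    # already passed the "not in" test.
    both_inside.wait()
    name = threading.current_thread().name
    calls.append(name)
    return name


def worker(key):
    if key not in dictionary:
        value = elaborate_calculation()
        dictionary[key] = value


threads = [
    threading.Thread(target=worker, args=("spam",), name=f"t{i}")
    for i in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(calls))          # 2: both threads did the calculation
print(dictionary["spam"])  # whichever thread stored its value last
```

The GIL is held the whole time Python bytecode runs here, yet the calculation still runs twice and the stored value depends on scheduling.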

All that the GIL does is protect Python's internal interpreter state. This doesn't mean that data structures used by Python code itself are now locked and protected.
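One common way to protect such a check-then-set sequence is an explicit lock. The following is a sketch under assumptions, not the only option: `dictionary_lock`, `get_or_compute`, `call_count` and the return value 42 are hypothetical names and values chosen for illustration.

```python
import threading

dictionary = {}
dictionary_lock = threading.Lock()  # hypothetical name; guards `dictionary`
call_count = 0


def elaborate_calculation():
    global call_count
    # Safe only because every caller holds dictionary_lock at this point.
    call_count += 1
    return 42  # placeholder result


def get_or_compute(key):
    # The test and the assignment now happen as one unit: no other thread
    # can run this block between the "not in" test and the store.
    with dictionary_lock:
        if key not in dictionary:
            dictionary[key] = elaborate_calculation()
        return dictionary[key]


threads = [
    threading.Thread(target=get_or_compute, args=("spam",))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(call_count)  # 1: only the first thread to take the lock calculates
```

Holding the lock during the calculation makes other callers wait for it, which is the price of guaranteeing a single calculation per key.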

Martijn Pieters answered Oct 08 '22 10:10
