The asyncio docs read:
Most asyncio objects are not thread safe. You should only worry if you access objects outside the event loop.
Could someone explain this or give an example of how misuse of asyncio can cause an unsynchronized write to an object shared between threads? I thought the GIL meant that only one thread can run the interpreter at a time and so events that happen in the interpreter, like reading and writing Python objects, are trivially synchronized between threads.
The second sentence in the quote above sounds like a clue but I'm not sure what to make of it.
I guess a thread could always cause havoc by releasing the GIL and deciding to write to Python objects anyway but that isn't specific to asyncio so I don't think that's what the docs are referring to here.
Is this maybe a matter of the asyncio PEPs reserving the option for certain asyncio objects to not be thread safe even though at the moment the implementation in CPython just so happens to be thread safe?
Actually, no. Each Python thread is exactly that: a new thread of the interpreter.
It is a real thread managed by the OS, not a thread simulated internally for Python code inside the Python virtual machine.
The GIL is needed precisely to keep that OS-level threading from corrupting Python objects.
Imagine one thread on one CPU and another on a second CPU: truly parallel threads, written in assembly, both trying to change the same memory location at the same time. That is not a desirable situation at all. The instructions that touch that location can interleave in any order, and the result can easily be corrupted data or a segmentation fault. In C you would avoid this by guarding the shared data with an explicit lock. The GIL does the same job for the C code that implements Python objects, so that an operation on an object doesn't lose its atomicity partway through. Imagine one thread inserting a value into a list whose elements are being shifted down at that very moment because another thread removed some of them. Without the GIL this could crash.
The GIL does nothing for the atomicity of your own code running in those threads. It only protects the interpreter's internals, such as memory management.
Even with thread-safe objects like deque(), if you perform more than one operation in a row without an additional lock, another thread's operation can land somewhere in between. And whoops, a problem occurs!
Let's say one thread takes an object from a stack, checks something about it, and removes it if the condition holds.
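from time import sleep

stack = [2, 3, 4, 5, 6, 7, 8]

def thread1():
    while True:
        v = stack[0]        # look at the first item
        sleep(0.001)
        if v % 2 == 0:
            del stack[0]    # remove the first item if the one we saw was even
        sleep(0.001)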
Of course, this is a silly way to do it, and using stack.pop(0) would avoid the problem. But it is an example.
And let's have another thread that adds to the stack every 0.002 seconds:
def thread2():
    while True:
        stack.insert(0, stack[-1] + 1)
        sleep(0.002)
Now if you do:
import threading

threading.Thread(target=thread2).start()
sleep(1)
threading.Thread(target=thread1).start()
There will be a moment, however unlikely, when thread2() pushes a new item exactly between thread1()'s retrieval and its deletion. thread1() will then remove the newly added item instead of the one it checked. The result is not what we wanted. So the GIL doesn't control what we do in our threads, only what the threads do to each other at a more basic level.
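To make that unlucky moment reproducible, here is a minimal sketch that forces the interleaving instead of waiting for it. It uses one-shot versions of thread1() and thread2(), and the two threading.Event objects (read_done, insert_done) and the wrapper run_race() are hypothetical names added only for this illustration; they stand in for the badly timed sleep() calls.

```python
import threading

def run_race():
    stack = [2, 3, 4, 5, 6, 7, 8]
    read_done = threading.Event()     # set once thread1 has looked at stack[0]
    insert_done = threading.Event()   # set once thread2 has inserted its item

    def thread1():
        v = stack[0]                  # check: sees 2
        read_done.set()
        insert_done.wait()            # stands in for the unlucky sleep()
        if v % 2 == 0:
            del stack[0]              # act: deletes whatever is first *now*

    def thread2():
        read_done.wait()
        stack.insert(0, stack[-1] + 1)    # puts 9 in front of the 2
        insert_done.set()

    workers = [threading.Thread(target=thread1), threading.Thread(target=thread2)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return stack

print(run_race())   # [2, 3, 4, 5, 6, 7, 8]: the 9 was deleted, the checked 2 survived
```

Every individual list operation here is atomic under the GIL, yet the check on 2 ends up deleting 9. The bug lives in the gap between the two operations, not inside either one.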
Imagine you wrote a server that sells tickets for some event. Two users connect and try to buy the same ticket at the same time. If you are not careful, the users may end up sitting one on top of the other.
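One common way to be careful is to put the check and the assignment under a single lock. The sketch below assumes an in-memory seat table; TicketOffice, buy() and customer() are hypothetical names invented for the illustration, not part of any real ticketing API.

```python
import threading

class TicketOffice:
    """Hypothetical in-memory seat registry, for illustration only."""

    def __init__(self, seats):
        self._owner = {seat: None for seat in seats}
        self._lock = threading.Lock()

    def buy(self, seat, user):
        # The check and the assignment run under one lock, so no other
        # thread can slip in between them.
        with self._lock:
            if self._owner[seat] is None:
                self._owner[seat] = user
                return True
            return False

office = TicketOffice(["A1"])
results = {}

def customer(name):
    results[name] = office.buy("A1", name)

threads = [threading.Thread(target=customer, args=(n,)) for n in ("alice", "bob")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)   # exactly one of the two buyers gets True
```

Which buyer wins depends on scheduling, but there is never more than one winner per seat.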
A thread-safe object is one that carries out an operation and does not allow another operation on it to take place until the first one has completed.
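For instance, deque.append() and deque.popleft() are documented as thread-safe: if two threads call append() on the same deque at the same moment, each call runs to completion before the other takes effect, so neither item is lost. That guarantee covers single calls only. Iterating over a deque is not protected in this way: if another thread appends in the middle of the iteration, the iterating thread gets a RuntimeError rather than the appending thread being blocked.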
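This is also what the asyncio quote in the question is getting at: objects such as asyncio.Queue assume that only code running inside the event loop touches them. When another OS thread needs to hand something to the loop, the documented route is loop.call_soon_threadsafe(), which schedules a callback to run in the loop's own thread. A minimal sketch follows; worker() and the three-item payload are made up for the example.

```python
import asyncio
import threading

async def main():
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()      # asyncio.Queue is not thread-safe

    def worker():
        # Runs in a separate OS thread. Calling queue.put_nowait() directly
        # from here would touch loop-owned state from outside the loop, so
        # the call is scheduled to run inside the loop instead.
        for i in range(3):
            loop.call_soon_threadsafe(queue.put_nowait, i)

    t = threading.Thread(target=worker)
    t.start()
    received = [await queue.get() for _ in range(3)]
    t.join()                     # worker has already scheduled everything
    return received

result = asyncio.run(main())
print(result)   # [0, 1, 2]
```

Callbacks scheduled this way run in the order they were registered, which is why the items arrive in order.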