Why does Python provide locking mechanisms if it's subject to a GIL?

I'm aware that Python threads can only execute bytecode one at a time, so why would the threading library provide locks? I'm assuming race conditions can't occur if only one thread is executing at a time.

The library provides locks, conditions, and semaphores. Is the only purpose of this to synchronize execution?

Update:

I performed a small experiment:

from threading import Thread
from multiprocessing import Process

num = 0

def f():
    global num
    num += 1

def thread(func):
    # return Process(target=func)
    return Thread(target=func)


if __name__ == '__main__':
    t_list = []
    for i in xrange(1, 100000):
        t = thread(f)
        t.start()
        t_list.append(t)

    for t in t_list:
        t.join()

    print num

Basically this should have started 99,999 threads (xrange(1, 100000)), each incrementing num by 1. The result printed was 99993.

a) How can the result not be 99999 if there's a GIL syncing and avoiding race conditions? b) Is it even possible to start 100k OS threads?

Update 2, after seeing answers:

If the GIL doesn't even make a simple operation like an increment atomic, what's the purpose of having it? It doesn't help with nasty concurrency issues, so why was it put in place? I've heard it matters for C extensions; can someone give an example?

danihodovic asked Nov 11 '14 20:11



1 Answer

The GIL synchronizes bytecode operations: only one bytecode instruction can execute at a time. But if an operation takes more than one bytecode instruction, the interpreter can switch threads between them. If you need the operation to be atomic, you need synchronization above and beyond the GIL.

For example, incrementing an integer is not a single bytecode:

>>> import dis
>>> def f():
...   global num
...   num += 1
...
>>> dis.dis(f)
  3           0 LOAD_GLOBAL              0 (num)
              3 LOAD_CONST               1 (1)
              6 INPLACE_ADD
              7 STORE_GLOBAL             0 (num)
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE

Here it took four bytecodes to implement num += 1 (LOAD_GLOBAL, LOAD_CONST, INPLACE_ADD, STORE_GLOBAL). The GIL will not ensure that num is incremented atomically. Your experiment demonstrates the problem: updates were lost because threads switched between the LOAD_GLOBAL and the STORE_GLOBAL.
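As a minimal sketch of the "synchronization above and beyond the GIL" mentioned earlier, the read-modify-write can be wrapped in a threading.Lock so no other thread can run between the load and the store. The names counter, counter_lock, increment, and run are made up for this illustration and are not from the question; the sketch is written for Python 3 and uses fewer threads with more increments each, rather than 100k threads.

```python
import threading

counter = 0
counter_lock = threading.Lock()  # hypothetical name; guards every update to counter


def increment(times):
    """Add 1 to the shared counter `times` times, holding the lock for each update."""
    global counter
    for _ in range(times):
        # The load, add, and store all happen while the lock is held,
        # so a thread switch in the middle cannot lose an update.
        with counter_lock:
            counter += 1


def run(num_threads=8, times=10000):
    """Reset the counter, run the workers to completion, and return the final value."""
    global counter
    counter = 0
    threads = [threading.Thread(target=increment, args=(times,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(run())
```

With the lock in place the final value should always equal num_threads * times. Removing the with block brings back the possibility of lost updates, although how often they actually show up depends on the interpreter version.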

The purpose of the GIL is to ensure that the reference counts on Python objects are incremented and decremented atomically. It isn't meant to help you with your own data structures.
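To make the reference counts a bit more concrete, here is a small sketch that is specific to CPython. It uses sys.getrefcount to observe an object's count changing when a second name is bound to it. The function name refcount_delta is invented for this illustration. The absolute numbers reported by sys.getrefcount vary between versions, so only the difference is inspected.

```python
import sys


def refcount_delta():
    """Return how much an object's reference count grows when one more name refers to it."""
    obj = object()
    before = sys.getrefcount(obj)
    alias = obj  # binding a second name stores another reference to the same object
    after = sys.getrefcount(obj)
    return after - before


if __name__ == "__main__":
    print(refcount_delta())
```

Every such increment and decrement happens inside the interpreter, in C. Since the GIL allows only one thread to run interpreter code at a time, these updates cannot interleave, which is the kind of internal bookkeeping it exists to protect.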

Ned Batchelder answered Oct 14 '22 05:10