Are there some cases where Python threads can safely manipulate shared state?

Tags: python, multithreading, gil

Some discussion in another question has encouraged me to better understand the cases where locking is required in multithreaded Python programs.

Per this article on threading in Python, I have several solid, testable examples of pitfalls that can occur when multiple threads access shared state. The example race condition provided on this page involves races between threads reading and manipulating a shared variable stored in a dictionary. I think the case for a race here is very obvious, and fortunately is eminently testable.

However, I have been unable to evoke a race condition with atomic operations such as list appends or variable increments. This test exhaustively attempts to demonstrate such a race:

import sys
from threading import Thread

def contains_all_ints(l, n):
    # After sorting, position i must hold the value i.
    l.sort()
    for i in range(n):
        if l[i] != i:
            return False
    return True

def test(ntests):
    results = []
    threads = []
    def lockless_append(i):
        # Append to the shared list without taking any lock.
        results.append(i)
    for i in range(ntests):
        threads.append(Thread(target=lockless_append, args=(i,)))
        threads[i].start()
    for i in range(ntests):
        threads[i].join()
    # Every value 0..ntests-1 must appear exactly once.
    return len(results) == ntests and contains_all_ints(results, ntests)

for i in range(100):
    if test(100000):
        print("OK", i)
    else:
        print("appending to a list without locks *is* unsafe")
        sys.exit()

I have run the test above without failure (100 runs of 100k multithreaded appends each). Can anyone get it to fail? Is there another class of object which can be made to misbehave via atomic, incremental modification by threads?

Do these implicitly 'atomic' semantics apply to other operations in Python? Is this directly related to the GIL?

814

asked Apr 29 '10 20:04

Erik Garrison

1 Answer

Appending to a list is thread-safe, yes. You can only append to a list while holding the GIL, and the list takes care not to release the GIL during the append operation (which is, after all, a fairly simple operation). The order in which different threads' append operations go through is of course up for grabs, but they will all be strictly serialized operations because the GIL is never released during an append.

The same is not necessarily true for other operations. Lots of operations in Python can cause arbitrary Python code to be executed, which in turn can cause the GIL to be released. For example, i += 1 is three distinct operations: "get i", "add 1 to it" and "store it in i", and another thread can run between any two of them. "Add 1 to it" translates (in this case) into i.__iadd__(1), or i.__add__(1) for types such as int that define no __iadd__, and that method can go off and do whatever it likes.
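To make the read-modify-write point concrete, here is a minimal sketch (Python 3) comparing a locked and an unlocked increment. The Counter class, its method names and the run helper are hypothetical names made up for illustration. Whether the unlocked version actually loses updates on a given run depends on the interpreter version and on timing, so only the locked total is predictable.

```python
import threading

class Counter:
    """Hypothetical shared counter, used only for illustration."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment_unsafe(self):
        # Read, add, store: another thread may run between these steps,
        # in which case one of the two updates is silently lost.
        self.value += 1

    def increment(self):
        # The lock turns the whole read-modify-write into one critical section.
        with self._lock:
            self.value += 1

def run(method_name, nthreads=8, per_thread=10000):
    """Call the named Counter method from several threads; return the total."""
    counter = Counter()

    def worker():
        bump = getattr(counter, method_name)
        for _ in range(per_thread):
            bump()

    threads = [threading.Thread(target=worker) for _ in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value

locked_total = run("increment")
unlocked_total = run("increment_unsafe")
print("locked:  ", locked_total)    # always nthreads * per_thread
print("unlocked:", unlocked_total)  # may be lower if updates were lost
```

The unlocked total can never exceed the locked one, since a race can only drop increments. A run where the two totals match does not show that the unlocked version is safe.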

Python objects themselves guard their own internal state -- dicts won't get corrupted by two different threads trying to set items in them. But if the data in the dict is supposed to be internally consistent, neither the dict nor the GIL does anything to protect that, except (in usual thread fashion) by making it less likely but still possible things end up different than you thought.
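As a sketch of the consistency point, assume a dict whose two values must always sum to the same total. The names balances, balances_lock, transfer and snapshot_total are hypothetical and not taken from any library. Each individual dict store is safe on its own, but the invariant spans two stores, so both the writers and the reader take the same lock.

```python
import threading

# Invariant assumed for this example: the two values always sum to 2000.
balances = {"a": 1000, "b": 1000}
balances_lock = threading.Lock()

def transfer(src, dst, amount):
    # Without the lock, a reader could observe the dict after the first
    # store but before the second, and see a total that is off by `amount`.
    with balances_lock:
        balances[src] -= amount
        balances[dst] += amount

def snapshot_total():
    # Readers take the same lock so they never see a half-finished transfer.
    with balances_lock:
        return balances["a"] + balances["b"]

observed = []

def mover(src, dst, count):
    for _ in range(count):
        transfer(src, dst, 1)

def observer(count):
    for _ in range(count):
        observed.append(snapshot_total())

threads = [
    threading.Thread(target=mover, args=("a", "b", 5000)),
    threading.Thread(target=mover, args=("b", "a", 5000)),
    threading.Thread(target=observer, args=(5000,)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balances, set(observed))
```

Dropping the lock from either transfer or snapshot_total would leave the dict itself uncorrupted, but the observer could then record totals other than 2000.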

120

answered Nov 15 '22 01:11

Thomas Wouters
