What is "thread local storage" in Python, and why do I need it?

In Python, everything is shared between threads except function-local variables (each function call gets its own set of locals, and each thread runs in its own chain of function calls). Even then, only the variables themselves (the names that refer to objects) are local to the function; the objects they refer to are not owned by any thread, and any code holding a reference can reach them. The Thread object for a particular thread is not special in this regard: if you store it somewhere all threads can access (such as a global variable), then all threads can access that one Thread object. If you want to atomically modify anything that another thread has access to, you have to protect it with a lock. All threads must of course share that very same lock, or it would not be effective.
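To make the point about sharing one lock concrete, here is a minimal sketch. The names counter, counter_lock and increment are made up for illustration; the only assumption is that several threads update the same module-level integer.

```python
import threading

counter = 0
counter_lock = threading.Lock()  # one lock object shared by every thread


def increment(times):
    """Add 1 to the shared counter `times` times."""
    global counter
    for _ in range(times):
        # "counter += 1" is a read, an add and a write; holding the lock
        # keeps another thread from interleaving between those steps.
        with counter_lock:
            counter += 1


threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

If each thread created its own Lock inside increment(), the with block would never make any thread wait, and the protection would be lost.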

If you want actual thread-local storage, that's where threading.local comes in. Attributes of a threading.local instance are not shared between threads; each thread sees only the attributes it itself placed there. If you're curious about its implementation, a pure-Python version is in _threading_local.py in the standard library (CPython normally uses an equivalent implementation written in C).

Consider the following code:

#!/usr/bin/env python

from time import sleep
from random import random
from threading import Thread, local

data = local()   # each thread gets its own attribute namespace on this object

def bar():
    print("I'm called from", data.v)

def foo():
    bar()

class T(Thread):
    def run(self):
        sleep(random())
        data.v = self.name   # Thread-1 and Thread-2 accordingly
        sleep(1)
        foo()

>>> T().start(); T().start()
I'm called from Thread-2
I'm called from Thread-1

Here threading.local() is used as a quick and dirty way to pass some data from run() to bar() without changing the interface of foo().

Note that using global variables won't do the trick:

#!/usr/bin/env python

from time import sleep
from random import random
from threading import Thread

def bar():
    global v
    print("I'm called from", v)

def foo():
    bar()

class T(Thread):
    def run(self):
        global v
        sleep(random())
        v = self.name   # both threads overwrite the same global
        sleep(1)
        foo()

>>> T().start(); T().start()
I'm called from Thread-2
I'm called from Thread-2

If you can afford to pass this data through as an argument of foo(), that is the more elegant and better-designed way:

from threading import Thread

def bar(v):
    print("I'm called from", v)

def foo(v):
    bar(v)

class T(Thread):
    def run(self):
        foo(self.name)   # Thread.name replaces the deprecated getName()

But this is not always possible when using third-party or poorly designed code.

You can create thread local storage using threading.local().

>>> import threading
>>> tls = threading.local()
>>> tls.x = 4
>>> tls.x
4

Data stored on tls is unique to each thread, which helps ensure that unintentional sharing does not occur.
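A small sketch of what "unique to each thread" means in practice. The names seen and worker are made up for illustration; the behaviour shown is that a new thread starts with no attributes on tls, and what it sets does not leak back to the main thread.

```python
import threading

tls = threading.local()
tls.x = 4  # set in the main thread only

seen = {}  # ordinary dict, shared, used to report what the worker observed


def worker():
    # A new thread starts with an empty namespace on tls.
    seen["has_x_before"] = hasattr(tls, "x")
    tls.x = 99  # affects only this thread's view
    seen["x_in_worker"] = tls.x


t = threading.Thread(target=worker)
t.start()
t.join()

print(seen)   # {'has_x_before': False, 'x_in_worker': 99}
print(tls.x)  # 4
```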

As in most other languages, every thread in Python has access to the same variables. There's no distinction between the 'main thread' and child threads in this respect.

One difference with Python (specifically CPython in its standard builds) is that the Global Interpreter Lock means only one thread can be running Python bytecode at a time. This isn't much help when it comes to synchronising access, however: a thread can still be pre-empted between any two bytecode instructions, so the usual issues apply and you have to use threading primitives just like in other languages. It does mean you need to reconsider if you were using threads to speed up CPU-bound work.

I may be wrong here. If you know otherwise, please expound, as this would help explain why one would need to use threading.local().

This statement seems off, though not wrong: "If you want to atomically modify anything that another thread has access to, you have to protect it with a lock." I think it is effectively right but not entirely accurate. I thought the term "atomic" meant that the Python interpreter produced a chunk of bytecode that left no room for an interruption in the middle.

As I understand it, atomic operations are chunks of Python bytecode that cannot be interrupted. A Python statement like "running = True" is atomic, so (I believe) you do not need a lock in this case; at the bytecode level it is safe from thread interruption.

Python code like "threads_running[5] = True" is not atomic, as far as I can tell. There are two chunks of bytecode here: one to look up the list object, and another to assign a value to a slot in that list. An interruption can occur between the two chunks, and that is where bad stuff happens.

How does threading.local() relate to "atomic"? This is why the statement seems misdirecting to me. If it is not, can you explain?
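The bytecode reasoning in this comment can be checked rather than guessed at. The sketch below uses the standard dis module to list the instructions each statement compiles to; opnames is a made-up helper, and the exact instruction names vary between CPython versions, so no output is shown. For what it's worth, the Python FAQ entry on thread-safe mutation lists "x = y" and "D[x] = y" among operations that are atomic for built-in types in CPython, and "i = i + 1" among those that are not, which suggests the item assignment is safer than the comment assumes.

```python
import dis


def opnames(source):
    """Return the bytecode instruction names that a statement compiles to."""
    code = compile(source, "<demo>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]


for statement in (
    "running = True",
    "threads_running[5] = True",
    "counter = counter + 1",
):
    # The store itself is a single instruction in the first two cases;
    # the third loads the old value and stores the new one separately.
    print(statement, "->", opnames(statement))
```

None of this involves threading.local(), which addresses a different problem: it does not make shared data safe to modify, it avoids sharing the data in the first place.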


Tags: python, multithreading, thread-local
