
Should I use thread local storage for variables that only exist in a {class,method}?

I am implementing a relatively simple thread pool with Python's Queue.Queue class. I have one producer class that contains the Queue instance plus some convenience methods, and a consumer class that subclasses threading.Thread. I instantiate one consumer object for every thread I want in my pool ("worker threads," I think they're called); the pool size is given by an integer.

Each worker thread takes a (flag, data) tuple off the queue, processes the data using its own database connection, and places the GUID of the row onto a list so that the producer class knows when a job is done.

While I'm aware that other modules implement the functionality I'm coding, the reason I'm coding this is to gain a better understanding of how Python threading works. This brings me to my question.

If I store anything in a function's namespace or in an instance's __dict__, will it be thread safe?

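import sqlite3
import threading

class Consumer(threading.Thread):
    def __init__(self, producer, db_filename):
        threading.Thread.__init__(self)  # the base class must be initialized
        self.producer = producer
        self.conn = sqlite3.connect(db_filename)  # Is this var thread safe?

    def run(self):
        flag, data = self.producer.queue.get()

        while flag != 'stop':
            # Do stuff with data; Is `data` thread safe?
            flag, data = self.producer.queue.get()  # fetch the next job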

I think both would be thread safe. Here's my rationale:

  • Each time a class is instantiated, a new __dict__ gets created. Under the scenario I outline above, I don't think any other object would have a reference to this object. (Now, perhaps the situation might get more complicated if I used join() functionality, but I'm not...)
  • Each time a function gets called, it creates its own namespace, which exists for the lifetime of that call. I'm not making any of my variables global, so I don't see how any other object could have a reference to a function variable.
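
The two points above can be checked with a small sketch. The names `Holder` and `double` are made up for illustration and are not part of the pool code; Python 3 is assumed.

```python
class Holder:
    def __init__(self, value):
        self.value = value  # stored in this instance's own __dict__


def double(x):
    result = x * 2  # `result` exists only in this call's local namespace
    return result


a = Holder(1)
b = Holder(2)
a.value = 99  # rebinding on `a` touches a.__dict__ only

separate_dicts = a.__dict__ is not b.__dict__
b_unchanged = b.value == 2
```

This only shows that the namespaces are separate. It says nothing about objects that two namespaces both happen to reference.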

This post addresses my question somewhat, but is still a little abstract for me.

Thanks in advance for clearing this up for me.

Sean Woods asked Mar 01 '23 12:03


2 Answers

You are right; this is thread-safe. Local variables (the ones you call "function namespace") are always thread-safe, since only the thread executing the function can access them. Instance attributes are thread-safe as long as the instance is not shared across threads. Since each Consumer instance represents exactly one thread, and nothing else in your code touches its attributes, its instances won't be shared across threads.
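
Here is a rough sketch of that claim, assuming Python 3 and a hypothetical `Worker` class standing in for the Consumer. It also surfaces one detail worth checking in the question's code: `__init__` runs in the thread that creates the object, while `run()` runs in the new thread. With sqlite3's default `check_same_thread=True`, a connection opened in `__init__` is rejected when used inside `run()`.

```python
import sqlite3
import threading


class Worker(threading.Thread):  # hypothetical stand-in for Consumer
    def __init__(self):
        super().__init__()
        self.init_thread = threading.get_ident()  # thread that built the object
        self.run_thread = None
        self.total = None
        self.error = None
        self.conn = sqlite3.connect(":memory:")   # opened in the creating thread

    def run(self):
        self.run_thread = threading.get_ident()   # the worker thread itself
        total = 0                                  # local: private to this call
        for i in range(1000):
            total += i
        self.total = total                         # attribute of this instance only
        try:
            self.conn.execute("SELECT 1")          # used from a different thread
        except sqlite3.ProgrammingError as exc:
            self.error = exc


workers = [Worker() for _ in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Each worker computes its own `total` without interference, and each records a `ProgrammingError` for the connection. Opening the connection at the top of `run()` instead is one way to keep it in the worker thread.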

The only "risk" here is the value of the data object: in theory, the producer might hold onto the data object after putting it into the queue, and (if the data object itself is mutable - make sure you understand what "mutable" means) may change the object while the Consumer is using it. If the producer leaves the data object alone after putting it into the queue, this is thread-safe.
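
To make that risk concrete, here is a small deterministic sketch, assuming Python 3 (where the module is `queue` rather than `Queue`). The names `job` and `seen` are invented for illustration. The queue stores a reference, so a change made by the producer after `put()` is visible to the consumer.

```python
import queue
import threading

q = queue.Queue()
seen = []


def consumer():
    data = q.get()
    seen.append(list(data))  # snapshot of what the consumer actually observed


job = [1, 2, 3]
q.put(job)      # the queue holds a reference to `job`, not a copy
job.append(4)   # the producer keeps mutating the same list object

t = threading.Thread(target=consumer)
t.start()
t.join()
```

Here the consumer sees `[1, 2, 3, 4]`, not the list as it was when it was queued. In a real pool the timing is not controlled like this, so the consumer could see either state.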

Martin v. Löwis answered Mar 03 '23 04:03


To make the data thread safe, use copy.deepcopy() to create a new copy of the data before putting it on the queue. The producer can then modify its own data in the next loop iteration without changing the consumer's copy before the consumer gets to it.
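
A minimal sketch of that idea, assuming Python 3 and a made-up `row` dictionary that the producer reuses on every iteration:

```python
import copy
import queue
import threading

q = queue.Queue()
seen = []


def consumer():
    while True:
        flag, data = q.get()
        if flag == "stop":
            break
        seen.append(data)  # each item is a private copy


t = threading.Thread(target=consumer)
t.start()

row = {"guid": None, "tags": []}  # hypothetical job data, reused by the producer
for guid in ("a", "b", "c"):
    row["guid"] = guid
    row["tags"].append(guid)
    q.put(("work", copy.deepcopy(row)))  # snapshot taken before queuing

q.put(("stop", None))
t.join()
```

Each queued item keeps the state `row` had at the time of the `put()`, including the nested `tags` list, which a shallow copy would still share. The cost is one full copy per job, so this is mainly worth it when the producer really does reuse its objects.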

Kenneth Loafman answered Mar 03 '23 02:03