Are python "global" (module) variables thread local?

Tags: python, django

I'd like to use an in-memory thread-local cache for a value from the database that isn't going to change during a request/response cycle, but gets called hundreds (potentially thousands) of times. My limited understanding is that using a "global"/module variable is one way to implement this type of cache.

e.g.:

#somefile.py

foo = None

def get_foo(request):
  global foo
  if not foo:
    foo = get_foo_from_db(request.blah)
  return foo

I'm wondering whether using this type of "global" is thread-safe in python, and that therefore I can be comfortable that get_foo_from_db() will get called exactly once per request/response cycle in django (using either runserver or gunicorn+gevent). Is my understanding correct? This thing gets called enough that even using memcached to store the value is going to be a bottleneck (I'm profiling it as we speak).

885

asked Mar 12 '13 15:03

B Robster

2 Answers

No, access to globals is not thread-safe. Threads do not get their own copy of globals; globals are shared among all threads in a process.

The code:

if not foo:
    foo = get_foo_from_db(request.blah)

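compiles to several Python bytecode instructions (the listing treats foo as a local variable; with the global declaration, the LOAD_FAST and STORE_FAST on foo become LOAD_GLOBAL and STORE_GLOBAL, and the same reasoning applies):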

  2           0 LOAD_FAST                1 (foo)
              3 POP_JUMP_IF_TRUE        24

  3           6 LOAD_GLOBAL              0 (get_foo_from_db)
              9 LOAD_FAST                0 (request)
             12 LOAD_ATTR                1 (blah)
             15 CALL_FUNCTION            1
             18 STORE_FAST               1 (foo)
             21 JUMP_FORWARD             0 (to 24)
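
A listing like this can be reproduced with the standard-library dis module. The following is a minimal sketch: get_foo_from_db is a hypothetical stand-in for the real lookup, and the exact opcodes and offsets depend on the Python version.

```python
import dis

foo = None


def get_foo_from_db(key):
    # Hypothetical stand-in for the real database lookup.
    return "value for %s" % key


def get_foo(request):
    global foo
    if not foo:
        foo = get_foo_from_db(request.blah)
    return foo


# Collect (opcode name, argument) pairs for every instruction in get_foo.
ops = [(ins.opname, ins.argval) for ins in dis.get_instructions(get_foo)]

# Print the human-readable listing; the test of foo and the assignment
# to foo show up as separate instructions with a call in between.
dis.dis(get_foo)
```

Because foo is declared global here, the read and the write appear as LOAD_GLOBAL and STORE_GLOBAL rather than the fast-local opcodes.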

A thread switch can occur between bytecode instructions, so another thread could alter foo after you tested it but before you assigned to it. Two threads can also both see foo as unset and both call get_foo_from_db().
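
If the goal is one cached value per thread rather than one shared global, the standard library's threading.local gives each thread its own attribute namespace. This is only a sketch under assumptions: get_foo_from_db, the key argument, and the worker function are hypothetical stand-ins, and the cached value lives as long as the thread does, not just for one request.

```python
import threading

_local = threading.local()  # each thread sees its own attributes on this object

calls = []                  # records every simulated database lookup
calls_lock = threading.Lock()


def get_foo_from_db(key):
    # Hypothetical stand-in for the real database lookup.
    with calls_lock:
        calls.append(key)
    return "foo-for-%s" % key


def get_foo(key):
    # The attribute is missing the first time each thread gets here.
    try:
        return _local.foo
    except AttributeError:
        _local.foo = get_foo_from_db(key)
        return _local.foo


def worker(key, results):
    value = None
    for _ in range(1000):
        value = get_foo(key)
    results[key] = value


results = {}
threads = [threading.Thread(target=worker, args=(k, results)) for k in ("a", "b", "c")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each thread performs exactly one lookup and never sees another thread's value, so no lock is needed around the cache itself.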

195

answered Nov 01 '22 13:11

Martijn Pieters

No, you are wrong on two counts.

Firstly, the use of "threads" is a bit vague here. Depending on how the server is configured, Django can be served using threads, processes, or both (see the mod_wsgi documentation for a full discussion). If there is a single thread per process, then you can guarantee that only one instance of a module will be available to each process. But that is highly dependent on that configuration.

Even so, it is still not the case that there will be "exactly one" call to that function per request/response cycle. This is because the lifetime of a process is entirely unrelated to that cycle. A process will last for multiple requests, so that variable will persist for all of those requests.
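
Since the process (and any module-level variable) outlives the request, one way to get a cache scoped to a single request/response cycle is to store the value on the request object itself. This is a sketch and not something either answer proposes: FakeRequest stands in for Django's request object, and _cached_foo and get_foo_from_db are hypothetical names.

```python
class FakeRequest:
    """Hypothetical stand-in for a Django request object."""

    def __init__(self, blah):
        self.blah = blah


db_calls = []  # records every simulated database lookup


def get_foo_from_db(key):
    # Hypothetical stand-in for the real database lookup.
    db_calls.append(key)
    return "foo-for-%s" % key


def get_foo(request):
    # Cache on the request so the value disappears with the request.
    if not hasattr(request, "_cached_foo"):
        request._cached_foo = get_foo_from_db(request.blah)
    return request._cached_foo


first = FakeRequest("a")
for _ in range(1000):
    get_foo(first)       # only the first call hits the "database"

second = FakeRequest("b")
get_foo(second)          # a new request triggers a fresh lookup
```

Because each request object is handled by a single thread or greenlet, this avoids shared state entirely, and nothing needs to be cleared between requests.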

answered Nov 01 '22 12:11

Daniel Roseman
