
Are python "global" (module) variables thread local?

Tags: python, django

I'd like to use an in-memory thread-local cache for a value from the database that isn't going to change during a request/response cycle, but gets called hundreds (potentially thousands) of times. My limited understanding is that using a "global"/module variable is one way to implement this type of cache.

e.g.:

#somefile.py

foo = None

def get_foo(request):
  global foo
  if not foo:
    foo = get_foo_from_db(request.blah)
  return foo

I'm wondering whether using this type of "global" is thread-safe in Python, and whether I can therefore be comfortable that get_foo_from_db() will get called exactly once per request/response cycle in Django (using either runserver or gunicorn+gevent). Is my understanding correct? This thing gets called often enough that even using memcached to store the value is going to be a bottleneck (I'm profiling it as we speak).

B Robster asked Mar 12 '13 15:03




2 Answers

No, access to globals is not thread-safe. Threads do not get their own copy of globals; globals are shared among all threads in a process.

The code:

if not foo:
    foo = get_foo_from_db(request.blah)

compiles to several Python bytecode instructions:

  2           0 LOAD_FAST                1 (foo)
              3 POP_JUMP_IF_TRUE        24

  3           6 LOAD_GLOBAL              0 (get_foo_from_db)
              9 LOAD_FAST                0 (request)
             12 LOAD_ATTR                1 (blah)
             15 CALL_FUNCTION            1
             18 STORE_FAST               1 (foo)
             21 JUMP_FORWARD             0 (to 24)
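A listing like this can be regenerated with the standard dis module. The following is a minimal sketch that disassembles the function as written in the question; the body of get_foo_from_db is a placeholder assumed here for illustration, and opcode names and offsets differ between Python versions.

```python
import dis

foo = None


def get_foo_from_db(blah):
    # Placeholder for the real database lookup (hypothetical body).
    return "value for %s" % blah


def get_foo(request):
    global foo
    if not foo:
        foo = get_foo_from_db(request.blah)
    return foo


# Collect the opcode names so they can be inspected programmatically.
opnames = [ins.opname for ins in dis.get_instructions(get_foo)]

if __name__ == "__main__":
    # Print the human-readable disassembly, one instruction per line.
    dis.dis(get_foo)
```

The test of foo and the store into foo appear as separate instructions, with the function call in between.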

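A thread switch can occur between any two bytecode instructions, so another thread could alter foo after you tested it but before you assign to it. (The listing above was produced with foo as a local variable, hence LOAD_FAST and STORE_FAST; with the global declaration from the question those become LOAD_GLOBAL and STORE_GLOBAL, and the same gap between test and store remains.)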

Martijn Pieters answered Nov 01 '22 13:11


No, you are wrong on two counts.

Firstly, the use of "threads" is a bit vague here. Depending on how its server is configured, Django can be served using threads, processes, or both (see the mod_wsgi documentation for a full discussion). If there is a single thread per process, then you can guarantee that only one instance of a module will be available to each process. But that is highly dependent on that configuration.

Even so, it is still not the case that there will be "exactly one" call to that function per request/response cycle. This is because the lifetime of a process is entirely unrelated to that cycle. A process will last for multiple requests, so that variable will persist for all of those requests.
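To tie the cached value to a single request/response cycle instead of to the process, one option is to store it on the request object itself. This is only a sketch under assumptions: it assumes the request object accepts arbitrary attributes (as a plain Python object does), the attribute name _foo_cache is invented for illustration, get_foo_from_db has a placeholder body, and SimpleNamespace stands in for a real request.

```python
from types import SimpleNamespace

calls = []  # records every simulated database hit


def get_foo_from_db(blah):
    # Hypothetical stand-in for the real database query.
    calls.append(blah)
    return "foo-for-%s" % blah


def get_foo(request):
    # Cache on the request, so the value disappears when the request does.
    try:
        return request._foo_cache  # _foo_cache is a made-up attribute name
    except AttributeError:
        request._foo_cache = get_foo_from_db(request.blah)
        return request._foo_cache


# Two stand-in requests; a real one would come from the framework.
first = SimpleNamespace(blah="a")
second = SimpleNamespace(blah="b")
values = [get_foo(first), get_foo(first), get_foo(second)]
```

Here the lookup runs once per request object, no matter how many requests the process or thread goes on to serve.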

Daniel Roseman answered Nov 01 '22 12:11