I'd like to use an in-memory thread-local cache for a value from the database that isn't going to change during a request/response cycle, but gets called hundreds (potentially thousands) of times. My limited understanding is that using a "global"/module variable is one way to implement this type of cache.
e.g.:
# somefile.py
foo = None

def get_foo(request):
    global foo
    if not foo:
        foo = get_foo_from_db(request.blah)
    return foo
I'm wondering whether using this type of "global" is thread-safe in Python, and whether I can therefore be confident that get_foo_from_db() will get called exactly once per request/response cycle in Django (using either runserver or gunicorn+gevent). Is my understanding correct? This thing gets called enough that even using memcached to store the value is going to be a bottleneck (I'm profiling it as we speak).
Multiple threads may be able to access the global variable directly, as described above. We can protect the global variable from race conditions by using a mutual exclusion lock via the threading.Lock class. First, we can create a lock at the same global scope as the global variable.
Threads share all global variables; the memory space where global variables are stored is shared by all threads (though, as we will see, you have to be very careful about accessing a global variable from multiple threads). This includes class-static members!
Normally, when you create a variable inside a function, that variable is local, and can only be used inside that function. To create a global variable inside a function, you can use the global keyword.
Local variables and parameters are always thread-safe. Instance variables, class variables, and global variables may not be thread-safe (but they might be). Nevertheless, threads and shared variables can be useful.
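To make the distinction concrete, here is a small sketch contrasting a shared module global with threading.local from the standard library. The names shared, _local, results, and worker are illustrative only.

```python
import threading

shared = []                 # module global: one object visible to every thread
_local = threading.local()  # attributes set on this object are per-thread
results = {}


def worker(name):
    shared.append(name)     # every thread mutates the same list
    _local.value = name     # each thread sees only its own attribute
    results[name] = _local.value


threads = [threading.Thread(target=worker, args=(f"t{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread never set _local.value, so it has no such attribute.
main_thread_has_value = hasattr(_local, "value")
```

A thread-local value is isolated between threads, but if the server reuses worker threads across requests, the value will still carry over from one request to the next unless it is cleared explicitly.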
No, access to globals is not thread-safe. Threads do not get their own copy of globals; globals are shared among threads.
The code:
    if not foo:
        foo = get_foo_from_db(request.blah)
compiles to several Python bytecode instructions:
  2           0 LOAD_FAST                1 (foo)
              3 POP_JUMP_IF_TRUE        24

  3           6 LOAD_GLOBAL              0 (get_foo_from_db)
              9 LOAD_FAST                0 (request)
             12 LOAD_ATTR                1 (blah)
             15 CALL_FUNCTION            1
             18 STORE_FAST               1 (foo)
             21 JUMP_FORWARD             0 (to 24)
A thread switch can occur between any two bytecode instructions, so another thread could alter foo after you tested it but before you assign to it.
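If you want to inspect the bytecode yourself, the standard-library dis module can do it. This is only a sketch: opcode names and offsets differ between Python versions, and because this version declares foo as global, it should show LOAD_GLOBAL/STORE_GLOBAL for foo rather than the LOAD_FAST/STORE_FAST seen in the listing above.

```python
import dis


def get_foo(request):
    global foo
    if not foo:
        foo = get_foo_from_db(request.blah)
    return foo


# The function is only disassembled, never called, so the undefined
# names (foo, get_foo_from_db) are harmless here.
opnames = [ins.opname for ins in dis.get_instructions(get_foo)]

# Call dis.dis(get_foo) instead to print a human-readable listing.
```

The point is that the test and the assignment are separate instructions, so nothing makes the pair atomic.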
No, you are wrong on two counts.
Firstly, the use of "threads" is a bit vague here. Depending on how its server is configured, Django can be served using threads, processes, or both (see the mod_wsgi documentation for a full discussion). If there is a single thread per process, then you can guarantee that only one instance of a module will be available to each process. But that is highly dependent on that configuration.
Even so, it is still not the case that there will be "exactly one" call to that function per request/response cycle. This is because the lifetime of a process is entirely unrelated to that cycle. A process will last for multiple requests, so that variable will persist for all of those requests.
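One way to tie the cache's lifetime to the request itself, rather than to the process, is to store the value on the request object. The following is a sketch under assumptions, not something the question's code already does: FakeRequest stands in for Django's request object, and the attribute name _foo_cache and the sentinel _MISSING are made up for illustration.

```python
class FakeRequest:
    """Hypothetical stand-in for django.http.HttpRequest."""

    def __init__(self, blah):
        self.blah = blah


db_calls = []  # records loader calls, for demonstration only


def get_foo_from_db(blah):
    """Hypothetical stand-in for the real database query."""
    db_calls.append(blah)
    return f"foo-for-{blah}"


_MISSING = object()  # sentinel, so falsy values such as 0 or "" are cached too


def get_foo(request):
    cached = getattr(request, "_foo_cache", _MISSING)
    if cached is _MISSING:
        cached = get_foo_from_db(request.blah)
        request._foo_cache = cached  # lives exactly as long as the request
    return cached


first = FakeRequest("a")
for _ in range(1000):
    get_foo(first)      # only the first call reaches the loader

second = FakeRequest("b")
get_foo(second)         # a new request triggers a fresh load
```

Because each request is a separate object, the cached value disappears with it, regardless of whether the server uses threads, processes, or greenlets.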