
Django: Python global variables overlap, even for separate runs

I have a Django (so Python) program with a global variable:

g_variable = []

I use this in several functions, where I also change its value:

def my_function():
    global g_variable  # only required for reassignment, not for .append()
    g_variable.append(some_value)

That worked great until I started calling the program multiple overlapping times - in Django that means that I loaded the webpage multiple times quickly. I expected that the global variable would only be global within each individual run, but that is not the case. The values that are appended to g_variable in one run can be seen in the next run.

To me this means that I now have to pass this variable around to all my functions:

def my_function(non_g_variable):
    non_g_variable.append(some_value)
    return non_g_variable

called with

non_g_variable = my_function(non_g_variable)

Is that correct? Before I change all my code I just want to make sure that I haven't missed something. It will add a lot of extra lines and return calls.

user984003 asked May 22 '26 17:05


2 Answers

You should probably redesign your code to get rid of the global variable, as other answers and comments say. Something along the lines of:

class WebpageStructure(object):
    def __init__(self, html):
        # parse the html
        self.structure = self.parse(html)

    def contains_link(self):
        # figure it out using self.structure
        return ...

# in the view(s)
webpage = WebpageStructure(html_to_parse)
if webpage.contains_link():
    ...

There are, however, other options:

  1. If your code always runs in a single thread, you can fix the problem by setting g_variable to [] at the start of each run. There is probably one top-level function (a Django view function perhaps?) that always marks the start of each run. Re-initialize g_variable in that top-level function.

  2. If your code runs multi-threaded, you cannot use a normal global variable. Concurrent threads will update the same global variable.

    Regarding 1 and 2: To run a Django site in a single thread, use manage.py runserver --nothreading for the development server. If you host your site in Apache/mod_wsgi, you can control this using daemon mode. Note that you can run multiple single-threaded processes side by side. A global variable will work in that scenario, provided it is reset at the start of each request as in option 1, since the processes are isolated from each other and each handles one request at a time.

    If possible your code should work in any process/thread model.

  3. If your code runs multi-threaded and you really want to avoid passing around the g_variable list, you can use thread-local variables (threading.local in Python's standard library).

Example:

import threading
threadlocal = threading.local()

def mydjangoview(request):
    # In your top-level view function, initialize the list
    threadlocal.g_variable = []
    # Then call the functions that use g_variable
    foo()
    bar()

    # ... and then I guess you probably return a response?
    return Response(...)

def foo():
    threadlocal.g_variable.append(some_value)

def bar():
    threadlocal.g_variable.append(some_other_value)
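
To check the isolation without a web server, here is a minimal sketch that runs the same pattern in two plain threads. The names worker and results are made up for this demo; results is an ordinary shared dict used only to collect what each thread saw. Note that servers usually reuse worker threads, so the re-initialization at the top of the view still matters.

```python
import threading

threadlocal = threading.local()
results = {}  # demo only: worker name -> copy of that thread's list


def foo(value):
    # Appends to the list that belongs to the calling thread.
    threadlocal.g_variable.append(value)


def worker(name):
    # Stand-in for a top-level view: start each run with a fresh list.
    threadlocal.g_variable = []
    foo(name + "-1")
    foo(name + "-2")
    results[name] = list(threadlocal.g_variable)


threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results["a"])
print(results["b"])
```

Each thread should only ever see the values it appended itself, regardless of how the two threads interleave.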

Other links:

  • Why is using thread locals in Django bad?
  • What is so bad with threadlocals

codeape answered May 25 '26 08:05


That's how global variables work in Python. The global state persists for as long as the web application server keeps running.
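
A small sketch of that behaviour, outside Django. Here handle_request is a hypothetical stand-in for a view, and calling it twice stands in for two requests served by the same long-running process:

```python
# Module-level state: created once, when the module is first imported.
g_variable = []


def handle_request(value):
    """Simulate one request that appends to the module-level list."""
    g_variable.append(value)
    return list(g_variable)  # snapshot of what this "request" sees


first = handle_request("a")
second = handle_request("b")

print(first)   # ['a']
print(second)  # ['a', 'b']  (the value from the first call is still there)
```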

A common solution would be to put your functions in a class, store the per-request state on that class and use a new instance of that class for each request.
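
Roughly like this, as a sketch only. RequestState, foo, bar and handle_request are illustrative names, not anything from Django or from the question's code:

```python
class RequestState:
    """Holds state that should live for exactly one request."""

    def __init__(self):
        self.values = []  # a fresh list for every instance

    def foo(self):
        self.values.append("from foo")

    def bar(self):
        self.values.append("from bar")


def handle_request():
    # Stand-in for a view: build a new state object per request.
    state = RequestState()
    state.foo()
    state.bar()
    return state.values


r1 = handle_request()
r2 = handle_request()
print(r1)  # ['from foo', 'from bar']
print(r2)  # ['from foo', 'from bar']  (nothing carried over from r1)
```

Because the list hangs off the instance rather than the module, this works the same way under any process or thread model.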

joeforker answered May 25 '26 08:05