
Understanding global object persistence in Python WSGI apps

Consider the following code in my WebApp2 application in Google App Engine:

import webapp2

count = 0  # module-level global: lives as long as the Python process


class MyHandler(webapp2.RequestHandler):

    def get(self):
        global count
        count = count + 1
        print count

With each refresh of the page, the count increments higher.

I'm coming from the PHP world, where every request gets a new global environment. What I understand to be happening here is that, because I'm using the WSGI configuration for WebApp2, Python does not kick off a new process on each request. If I were using a CGI configuration, on the other hand, the global environment would be re-instantiated each time, like PHP...

Assuming the above is correct (If not, please correct me) ...

  1. How could I handle scenarios where I'd want a global variable that persisted only for the lifetime of the request? I could put an instance variable in the RequestHandler class, but what about things like utility modules that I import that use global vars for things like storing a message object?
  2. Is there some sort of technique to reset all variables, or to force a re-instantiation of the environment?
  3. Does the global environment persist indefinitely, or does it reset itself at some point?
  4. Is any of this GAE specific, or does WSGI global persistence work the same in any server scenario?

EDIT:

Here's an attempt using threadlocal:

import threading

import webapp2

count = 0

mydata = threading.local()
mydata.count = 0  # set only for the thread that imports this module


class MyHandler(webapp2.RequestHandler):

    def get(self):
        global count
        count = count + 1
        print count

        mydata.count = mydata.count + 1
        print mydata.count

Both counters still increment across requests.

Yarin asked Nov 17 '11 16:11


2 Answers

Your understanding is correct. If you want variables that persist for the duration of the request, you shouldn't make them globals at all - make them instance variables on your RequestHandler class, accessed as self.var. Since a new RequestHandler is instantiated for each request, your variables will stick around exactly as long as you need them to. Global variables are best avoided unless you really do need global (as opposed to request-specific) scope.
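As a rough illustration of that point, here is a minimal sketch in Python 3 that uses only a plain WSGI callable rather than webapp2 itself. `Handler`, `app`, and `call` are hypothetical names standing in for a framework's request handler, its dispatcher, and a test client; the only thing the sketch is meant to show is that state stored on `self` starts fresh for each request, while a module-level global keeps its value for the life of the process.

```python
# Hypothetical stand-in for a framework handler; this is not webapp2's API.
hits = 0  # module-level: shared by every request served by this process


class Handler:
    def __init__(self, environ):
        self.environ = environ
        self.messages = []  # per-request: a new list for every Handler instance

    def get(self):
        global hits
        hits += 1
        self.messages.append("handled %s" % self.environ.get("PATH_INFO", "/"))
        return "hits=%d messages=%d" % (hits, len(self.messages))


def app(environ, start_response):
    """Plain WSGI callable that builds a fresh Handler for each request."""
    body = Handler(environ).get().encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call(path):
    """Invoke the WSGI app once in-process and return the response body."""
    chunks = app({"REQUEST_METHOD": "GET", "PATH_INFO": path},
                 lambda status, headers: None)
    return b"".join(chunks).decode("utf-8")
```

Calling `call("/")` twice in the same process should show `hits` growing while `messages` stays at one entry, because the second request gets its own `Handler` instance.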

Also note that your App Engine app will run on multiple servers, and each one has its own copy of the globals; a global is only shared among requests handled by the same server.

Nick Johnson answered Sep 20 '22 09:09


Your analysis of the situation is correct: a Python web app is a long-running process. Spinning up the Python interpreter takes a long time, so it is not done on every request.

It's entirely possible to create a global variable that is different "per-request". This is done in a lot of frameworks and people seem to like it. How you do it depends on the server. Most servers use "one thread per request", and I think GAE does as well. If that is the case, you can use a thread-local variable. Because threads are reused, a thread-local value can stick around between requests on that thread; if you are worried about that, you will need some management code that hooks into the start/end of a request. WSGI middleware is a good place for this if the WebApp2 framework doesn't provide a nice way to do it.
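A minimal sketch of that middleware idea, in Python 3 with only the standard library. It assumes a one-thread-per-request server; `request_state`, `ResetStateMiddleware`, `counting_app`, and `call` are hypothetical names invented for illustration and are not part of webapp2 or GAE.

```python
import threading

# Hypothetical per-request storage; each thread sees its own attributes.
request_state = threading.local()


class ResetStateMiddleware:
    """WSGI middleware that re-initialises thread-local state per request."""

    def __init__(self, wrapped_app):
        self.wrapped_app = wrapped_app

    def __call__(self, environ, start_response):
        # Threads are reused across requests, so start from a known value
        # instead of trusting whatever the previous request left behind.
        request_state.count = 0
        return self.wrapped_app(environ, start_response)


def counting_app(environ, start_response):
    """Inner app: increments the thread-local counter, as in the question."""
    request_state.count += 1
    body = str(request_state.count).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


app = ResetStateMiddleware(counting_app)


def call(wsgi_app):
    """Invoke a WSGI app once in-process and return the response body."""
    chunks = wsgi_app({"REQUEST_METHOD": "GET", "PATH_INFO": "/"},
                      lambda status, headers: None)
    return b"".join(chunks).decode("utf-8")
```

With the middleware in place, every request through `app` sees the counter start at zero, so the response is the same each time. Calling `counting_app` directly on the same thread skips the reset and picks up the leftover value, which is the behaviour seen in the question's thread-local attempt.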

It's just Python, and a request is served in its own thread. From there you can do what you'd like. There's nothing in Python that simply resets all the global variables, and there's generally no guarantee (especially with GAE) that the process serving your request will be the same process every time, meaning your globals shouldn't be used to persist data between requests unless you really know what you're doing.

There are many frameworks out there that provide good support for doing this already, so if WebApp2 doesn't then I'd suggest looking elsewhere. Python has a lot of options and many of them do run on GAE.

Michael Merickel answered Sep 23 '22 09:09