
Why Python does not release memory (under mod_wsgi + Django)

I have an Apache + mod_wsgi + Django app. mod_wsgi runs in daemon mode.

I have one view that fetches a large queryset from the DB, builds an array by computing results from that queryset, and then returns the array. I'm not using thread-local storage, global variables or anything like that.

The problem is that my app's memory usage grows in proportion to the number of threads I set for mod_wsgi.

I ran a small experiment: I set various thread counts in mod_wsgi, then hit my view with curl and checked how high the wsgi process's memory usage climbs.

It goes like this:

1 thread  - 256 MB
2 threads - 400 MB
3 threads - 535 MB
4 threads - 650 MB

So each thread adds about 120-140 MB to the peak memory usage.
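As a quick check of that estimate, the per-thread increments can be computed directly from the figures listed above. This is only arithmetic on the reported numbers; the names `peak_mb` and `per_thread_increments` are made up for illustration.

```python
# Peak memory (MB) of the mod_wsgi process, keyed by thread count.
# The figures are the ones reported in the question.
peak_mb = {1: 256, 2: 400, 3: 535, 4: 650}


def per_thread_increments(peaks):
    """Return the extra peak memory contributed by each additional thread."""
    counts = sorted(peaks)
    return [peaks[b] - peaks[a] for a, b in zip(counts, counts[1:])]


increments = per_thread_increments(peak_mb)
average = sum(increments) / len(increments)

print(increments)         # [144, 135, 115]
print(round(average, 1))  # 131.3
```

The increments range from 115 to 144 MB, averaging roughly 131 MB per extra thread.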

It seems like the memory allocated for the first request is never freed. In the single-thread scenario, it's reused when a second request (to the same view) arrives. I can live with that.

But when I use multiple threads and a request is processed by a thread that has never handled this request before, that thread "saves" another 140 MB somewhere locally.
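One way to observe this from inside the process is a small WSGI middleware that logs the peak resident set size together with the name of the handling thread. The following is a minimal sketch, not part of the original setup: the class name `PeakMemoryLogger` is hypothetical, and it assumes a Unix host where the standard `resource` module is available (on Linux `ru_maxrss` is reported in kilobytes).

```python
import resource
import sys
import threading


def peak_rss():
    """Peak resident set size of this process as reported by getrusage.

    On Linux the unit is kilobytes; other platforms may differ.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


class PeakMemoryLogger:
    """Hypothetical WSGI middleware that logs peak RSS per request.

    The measurement is taken when the wrapped app returns its response
    iterable, not after the body has been sent to the client.
    """

    def __init__(self, app, stream=sys.stderr):
        self.app = app
        self.stream = stream

    def __call__(self, environ, start_response):
        before = peak_rss()
        try:
            return self.app(environ, start_response)
        finally:
            after = peak_rss()
            self.stream.write(
                "%s %s peak_rss=%d (+%d)\n"
                % (
                    threading.current_thread().name,
                    environ.get("PATH_INFO", ""),
                    after,
                    after - before,
                )
            )


# Assumed usage in wsgi.py (requires Django, so it is left commented out):
# from django.core.wsgi import get_wsgi_application
# application = PeakMemoryLogger(get_wsgi_application())
```

If the peak only jumps the first time each thread serves the heavy view, the log lines should show a non-zero increase once per thread name and close to zero afterwards.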

  • How can I fix this?
  • Probably Django saves some data in TLS (thread-local storage). If that is the case, how can I disable it?
  • Alternatively, as a workaround, is it possible to bind request execution to a certain thread in mod_wsgi?

Thanks.

PS. DEBUG is set to False in settings.py

Zaar Hai asked Oct 22 '13 12:10



1 Answer

In this sort of situation, what you should do is vertically partition your web application so that it runs across multiple mod_wsgi daemon process groups. That way you can tailor the configuration of the mod_wsgi daemon processes to the requirements of the subset of URLs that you delegate to each. As the admin interface URLs of a Django application often have high transient memory usage yet aren't used very often, a configuration along these lines is often recommended:

WSGIScriptAlias / /my/path/site/wsgi.py
WSGIApplicationGroup %{GLOBAL}

WSGIDaemonProcess main processes=3 threads=5
WSGIProcessGroup main

WSGIDaemonProcess admin threads=2 inactivity-timeout=60
<Location /admin>
WSGIProcessGroup admin
</Location>

This creates two daemon process groups. By default, URLs are handled in the main daemon process group, where the processes are persistent.

URLs for the admin interface, however, are directed to the admin daemon process group, which is set up with a single process and a reduced number of threads, plus an inactivity timeout so that the process is restarted automatically once the admin interface hasn't been used for 60 seconds, thereby reclaiming any excessive transient memory usage.

This means that a request to the admin interface can be slightly slower if the process has been recycled since the last request, as everything has to be loaded again, but since it is the admin interface and not a public URL, this is generally acceptable.

Graham Dumpleton answered Oct 12 '22 03:10
