Performance of concurrency in Django (apache2 prefork/mod_wsgi), what am I doing wrong?

First of all, I am not in any way unhappy with the performance of my Django-powered site. It's not getting massive traffic, a bit over 1000 visits per day so far.

I was curious how well it would cope with heavy traffic peaks, so I used the ab tool to do some benchmarking.

I noticed that with a concurrency greater than 1, the site delivers the same number of requests per second as with 1 concurrent connection.

Shouldn't the reqs/s increase with concurrency?

I'm on a virtual machine with 1 GB of RAM, apache2 (prefork), mod_wsgi, memcached and mysql.
All content on the page has been cached, so the database does not take any hits. And if memcached were to drop the entry, there are only 2 light (indexed) queries, and the result should immediately be re-cached.

Benchmarking data (note: I also benchmarked it with 2000 and 10k requests, with the same results):

For the startpage, served through apache2/mod_wsgi by django:
-n100 -c4: http://dpaste.com/97999/ (58.2 reqs/s)
-n100 -c1: http://dpaste.com/97998/ (57.7 reqs/s)

For robots.txt, directly from apache2:
-n100 -c4: http://dpaste.com/97992/ (4917 reqs/s)
-n100 -c1: http://dpaste.com/97991/ (1412 reqs/s)
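
To put the two cases side by side, here is a minimal sketch that computes how much the throughput changes between -c1 and -c4. The reqs/s figures are the ones listed above; the dictionary layout and the function name scaling_factor are just illustrative choices.

```python
# Requests per second reported by ab above, keyed by concurrency level (-c).
results = {
    "django startpage": {1: 57.7, 4: 58.2},
    "robots.txt (static)": {1: 1412.0, 4: 4917.0},
}


def scaling_factor(rates, base=1, target=4):
    """Throughput at `target` concurrency relative to `base` concurrency."""
    return rates[target] / rates[base]


for name, rates in results.items():
    print(f"{name}: {scaling_factor(rates):.2f}x from -c1 to -c4")
```

The static file scales by roughly 3.5x with four concurrent connections, while the Django page stays at about 1.0x, which is the behaviour the question is about.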

This is my apache conf: http://dpaste.com/97995/

Edit: Added more information

wsgi.conf: http://dpaste.com/98461/

mysite.conf: http://dpaste.com/98462/

My wsgi-handler:

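import os, sys
# Tell Django which settings module to load before the handler is created.
os.environ['DJANGO_SETTINGS_MODULE'] = 'myproject.settings'
import django.core.handlers.wsgi
# mod_wsgi looks for a module-level callable named "application".
application = django.core.handlers.wsgi.WSGIHandler()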
schmilblick asked Sep 25 '09 07:09



1 Answer

As you are using the prefork MPM and mod_wsgi in embedded mode with lots of processes, you are possibly killing the performance of your box. For a start, I suggest you read:

http://blog.dscpl.com.au/2009/03/load-spikes-and-excessive-memory-usage.html

When using embedded mode as you are, you need to tune your MPM parameters carefully. Setting MaxRequestsPerChild to a non-zero value is not a good start, as you are going to periodically force out the Apache processes, causing a load spike each time because everything has to reload.
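
As a rough illustration of how often that forced reload can happen under a benchmark, here is a back-of-the-envelope sketch. It assumes requests are spread evenly over the child processes, which prefork does not guarantee. The 58.2 reqs/s figure is from the question; the MaxRequestsPerChild value of 500 and the pool of 10 processes are hypothetical, since the real values are only in the linked config.

```python
def seconds_between_recycles(max_requests_per_child, total_reqs_per_sec, processes):
    """Approximate lifetime of one Apache child before it is recycled.

    Assumes (as a simplification) that load is spread evenly over all children.
    """
    per_child_rate = total_reqs_per_sec / processes
    return max_requests_per_child / per_child_rate


# 58.2 reqs/s comes from the ab run in the question.
# 500 and 10 are hypothetical values chosen only for illustration.
lifetime = seconds_between_recycles(
    max_requests_per_child=500,
    total_reqs_per_sec=58.2,
    processes=10,
)
print(f"each child is recycled roughly every {lifetime:.0f} s")
```

Under those assumed numbers, every child has to reload the whole Django application again after about a minute and a half of benchmarking.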

I would suggest the worker MPM, with your Python web application running in mod_wsgi daemon mode. For a start, this will result in far fewer processes being run and less memory overhead, and will give more predictability around the performance of the system. You can then start to look more closely at why things may be running slower.

One thing to pay attention to is what you get for the following section of 'ab' output:

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     0    0   0.2      0       2
Waiting:        0    0   0.1      0       2
Total:          0    0   0.2      0       2

If the max column shows large values, then you are getting hit by application loading costs, either because you did not eliminate them from your tests through preloading or because of a short process restart interval.
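
A minimal sketch of checking that section automatically is given here. It parses the 'Connection Times' rows from saved 'ab' output and flags a suspiciously large max. The has_load_spike heuristic and its thresholds (factor, floor_ms) are assumptions made for illustration, not something 'ab' itself provides.

```python
import re

# Sample copied from the 'ab' output shown above.
AB_OUTPUT = """\
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     0    0   0.2      0       2
Waiting:        0    0   0.1      0       2
Total:          0    0   0.2      0       2
"""

# One data row: name, min, mean, sd, median, max.
ROW = re.compile(
    r"^(Connect|Processing|Waiting|Total):\s+"
    r"(\d+)\s+(\d+)\s+([\d.]+)\s+(\d+)\s+(\d+)\s*$"
)


def parse_connection_times(text):
    """Return {row name: {"min", "mean", "sd", "median", "max"}} from ab output."""
    rows = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            name, lo, mean, sd, median, hi = match.groups()
            rows[name] = {
                "min": int(lo),
                "mean": int(mean),
                "sd": float(sd),
                "median": int(median),
                "max": int(hi),
            }
    return rows


def has_load_spike(rows, factor=10, floor_ms=100):
    """Hypothetical heuristic: max is at least floor_ms and far above the median."""
    total = rows["Total"]
    return total["max"] >= floor_ms and total["max"] > factor * max(total["median"], 1)


rows = parse_connection_times(AB_OUTPUT)
print(rows["Total"]["max"], has_load_spike(rows))
```

For the sample above the max is only 2 ms, so nothing is flagged. A run where a few requests wait on a freshly started process would show a max far above the median.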

Graham Dumpleton answered Sep 20 '22 12:09
