
How to measure Django cache performance?

I have a rather small (ca. 4.5k pageviews a day) website running on Django, with PostgreSQL 8.3 as the db.

I am using the database as both the cache and the session backend. I've heard a lot of good things about using Memcached for this purpose, and I would definitely like to give it a try. However, I would like to know exactly what the benefits of such a change would be: I imagine that my site may just not be big enough for a better cache backend to make a difference. The point is: it wouldn't be me who would be installing and configuring Memcached, and I don't want to waste somebody's time for nothing or very little.

How can I measure the overhead introduced by using the db as the cache backend? I've looked at django-debug-toolbar, but if I understand correctly it isn't something you'd like to put on a production site (you have to set DEBUG=True for it to work). Unfortunately, I cannot quite reproduce the production setting on my laptop (I have a different OS, CPU and a lot more RAM).

Has anyone benchmarked different Django cache/session backends? Does anybody know what the performance difference would be if I were doing, for example, one session write on every request?

Ryszard Szopa asked May 06 '09 08:05



3 Answers

At my previous job we tried to measure the impact of caching on a site we were developing. On the same machine we load-tested the set of 10 pages most commonly used as start pages (object listings), plus some object detail pages taken randomly from a pool of ~200000. Throughput went from roughly 150 requests/second without caching to roughly 30000 requests/second with it, and the database queries dropped to 1-2 per page.

What was cached:

  • sessions
  • lists of objects retrieved for each individual page in object listing
  • secondary objects and common content (found on each page)
  • lists of object categories and other categorising properties
  • object counters (calculated offline by cron job)
  • individual objects

In general, we used only low-level granular caching, not the high-level cache framework. It required very careful design: the cache had to be properly invalidated upon each database state change, such as adding or modifying any object.
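As an illustration only (this is not the code from that project), here is a minimal sketch of the low-level pattern: read through the cache, and delete the key whenever the underlying object changes. `DictCache`, `DATABASE`, `load_from_db`, `get_object` and `update_object` are hypothetical names; `DictCache` mimics only the `get`/`set`/`delete` subset of Django's low-level cache API so that the snippet runs without Django.

```python
class DictCache:
    """Hypothetical stand-in for the get/set/delete subset of Django's cache API."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        # timeout is accepted for API compatibility but ignored in this stand-in.
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


# In a Django project this would be: from django.core.cache import cache
cache = DictCache()

DATABASE = {1: {"name": "first"}}  # hypothetical stand-in for a table
stats = {"db_reads": 0}            # counts how often we hit the "database"


def load_from_db(pk):
    stats["db_reads"] += 1
    return dict(DATABASE[pk])


def get_object(pk):
    """Return the object from the cache, loading and caching it on a miss."""
    key = f"object:{pk}"
    obj = cache.get(key)
    if obj is None:
        obj = load_from_db(pk)
        cache.set(key, obj, timeout=300)
    return obj


def update_object(pk, **fields):
    """Change the stored object and invalidate its cache entry."""
    DATABASE[pk].update(fields)
    cache.delete(f"object:{pk}")
```

Every write path has to call the invalidation step; a single code path that modifies the data without deleting the key will serve stale objects until the timeout expires.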

zgoda answered Sep 29 '22 05:09



The DiskCache project publishes Django cache benchmarks comparing local memory, Memcached, Redis, file-based, and diskcache.DjangoCache backends. An added benefit of DiskCache is that no separate process is necessary (unlike Memcached and Redis). Instead, cache keys and small values are memory-mapped into the Django process memory. Retrieving values from the cache is generally faster than Memcached on localhost. A number of settings control how much data is kept in memory; the rest is paged out to disk.
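If you want numbers from your own production host, you can time the cache calls directly. The following is a minimal sketch, not the DiskCache benchmark itself: `time_cache_ops` is a hypothetical helper that accepts any object with `get` and `set` methods (Django's `django.core.cache.cache` has both), and `DictCache` is a stand-in included only so that the snippet runs without Django.

```python
import statistics
import time


class DictCache:
    """Hypothetical stand-in for the get/set subset of Django's cache API."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        self._data[key] = value


def time_cache_ops(cache, rounds=1000, payload="x" * 1024):
    """Time `rounds` set/get pairs and return median latencies in seconds."""
    set_times, get_times = [], []
    misses = 0
    for i in range(rounds):
        key = f"bench:{i}"

        start = time.perf_counter()
        cache.set(key, payload)
        set_times.append(time.perf_counter() - start)

        start = time.perf_counter()
        value = cache.get(key)
        get_times.append(time.perf_counter() - start)

        if value != payload:
            misses += 1

    return {
        "rounds": rounds,
        "misses": misses,
        "set_median_s": statistics.median(set_times),
        "get_median_s": statistics.median(get_times),
    }


if __name__ == "__main__":
    # In a Django shell, pass django.core.cache.cache instead of DictCache().
    print(time_cache_ops(DictCache(), rounds=100))
```

Running it once against the current database backend and once against the candidate backend, on the same machine, gives a per-call latency you can multiply by the number of cache and session operations per request.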

GrantJ answered Sep 29 '22 07:09



Short answer: if you have enough RAM, Memcached will always be faster. You can't really benchmark Memcached against the database cache; just keep in mind that the big bottleneck on servers is disk access, especially write access.

That said, a disk-based cache is the better choice if you have many objects to cache and long expiration times. In that situation, though, if you want top performance, it is better to generate your pages statically with a Python script and serve them with lighttpd or nginx.

For Memcached, you can adjust the amount of RAM dedicated to the server.
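For reference, here is a sketch of what pointing Django's cache and sessions at Memcached could look like in `settings.py`. It assumes a recent Django release (the `PyMemcacheCache` backend requires Django 3.2 or later and the `pymemcache` package) and a Memcached server on the default local port; the address is an assumption, and older Django versions configure the cache differently.

```python
# settings.py fragment (sketch, assumes Django 3.2+ with pymemcache installed).
# The memory limit is not set here: it is set when starting the memcached
# server itself, via its -m option (value in megabytes).
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": "127.0.0.1:11211",  # assumed host:port of the Memcached server
    }
}

# Store sessions in the cache configured above instead of the database.
SESSION_ENGINE = "django.contrib.sessions.backends.cache"
```

With cache-only sessions, session data is lost if Memcached restarts or evicts the entry; `django.contrib.sessions.backends.cached_db` keeps a database copy if that matters.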

fredz answered Sep 29 '22 05:09
