I have a Django app that gets near-realtime data (tweets and votes), although updates occur only every minute or two on average. However, we want to show the data by updating the site and API results as soon as it comes in.
We might see a whole ton of load on this site, so my initial thought is of course caching!
Is it practical to have some sort of Memcached cache that gets invalidated manually by another process or event? In other words, I would cache views for a long time, and then have new tweets and votes invalidate the entire view.
I'm not concerned about invalidating only some of the objects, and I considered subclassing the MemcachedCache backend to add some functionality following this strategy. But of course, Django's sessions also use Memcached as a write-through cache, and I don't want to invalidate that.
Cache invalidation is probably the best way to handle the stuff you're trying to do. Based on your question's wording, I'm going to assume the following about your app:
Assuming the above two things are true, cache invalidation is definitely the way to go. Here's the best way to do it in Django:
This is essentially what Django signals are meant for. They'll run automatically after your object is saved / updated, which is a great time to update your cache stores with the freshest information.
Doing it this way means you never need to run a background job that periodically scans your database and updates your cache. Your cache is brought up to date with the latest data as soon as an object is saved.
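One possible way to wire this up, sketched under some assumptions: the signal handler bumps a "generation" number that is part of every view cache key, so all cached views go stale at once while unrelated keys such as sessions are left alone. The names `DictCache`, `GENERATION_KEY`, `view_key`, `cached_view` and `invalidate_views` are hypothetical, and `DictCache` is only an in-memory stand-in for the `get`/`set`/`add`/`incr` subset of Django's cache API so the snippet runs without a Memcached server.

```python
class DictCache:
    """In-memory stand-in for the get/set/add/incr part of Django's cache API."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        # The stand-in ignores timeouts; a real backend would expire the key.
        self._data[key] = value

    def add(self, key, value, timeout=None):
        # Like Django's cache.add: only stores the value if the key is absent.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def incr(self, key, delta=1):
        # Like Django's cache.incr: raises ValueError for a missing key.
        if key not in self._data:
            raise ValueError(f"Key {key!r} not found")
        self._data[key] += delta
        return self._data[key]


# Assumption: in a real project this would be `from django.core.cache import cache`.
cache = DictCache()

GENERATION_KEY = "views:generation"  # hypothetical key name


def view_key(name):
    """Build a cache key that embeds the current generation number."""
    generation = cache.get(GENERATION_KEY, 1)
    return f"views:{generation}:{name}"


def cached_view(name, render):
    """Return the cached body for `name`, rendering it on a cache miss."""
    key = view_key(name)
    body = cache.get(key)
    if body is None:
        body = render()
        cache.set(key, body, timeout=3600)  # long timeout; invalidation is manual
    return body


def invalidate_views(sender=None, **kwargs):
    """Signal-style handler: bump the generation so old view keys are never read again."""
    cache.add(GENERATION_KEY, 1, timeout=None)  # make sure the counter exists
    cache.incr(GENERATION_KEY)
```

In a Django project the handler could be registered with `post_save.connect(invalidate_views, sender=Tweet)`, where `Tweet` stands for whatever model holds the incoming data. Old view entries are not deleted; they simply stop being requested and eventually expire or get evicted, which is why session keys stored in the same Memcached instance are unaffected.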