Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Caching paginated results, purging on update - how to solve?

I've created a forum, and we're implementing an apc and memcache caching solution to save the database some work.

I started implementing the cache layer with keys like "Categories::getAll", and if I had user-specific data, I'd append the keys with stuff like the user ID, so you'd get "User::getFavoriteThreads|1471". When a user added a new favorite thread, I'd delete the cache key, and it would recreate the entry.

However, and here comes the problem:

I wanted to cache the threads in a forum. Simple enough, "Forum::getThreads|$iForumId". But... With pagination, I'd have to split this into several cache entries, for example

"Forum::getThreads|$iForumId|$iLimit|$iOffset".

Which is alright, until someone posts a new thread in the forum. I will now have to delete all the keys under "Forum::getThreads|$iForumId", no matter what the limit and offset is.

What would be a good way of solving this problem? I'd really rather not loop through every possible limit and offset until I find something that doesn't match anymore.

Thanks.

like image 575
Rexxars Avatar asked Sep 20 '08 21:09

Rexxars


4 Answers

Just an update: I decided that Josh's point on data usage was a very good one. People are unlikely to keep viewing page 50 of a forum.

Based on this model, I decided to cache the 90 latest threads in each forum. In the fetching function I check the limit and offset to see if the specified slice of threads is within cache or not. If it is within the cache limit, I use array_slice() to retrieve the right part and return it.

This way, I can use a single cache key per forum, and it takes very little effort to clear/update the cache :-)

I'd also like to point out that in other more resource heavy queries, I went with flungabunga's model, storing the relations between keys. Unfortunately Stack Overflow won't let me accept two answers.

Thanks!

like image 145
Rexxars Avatar answered Sep 27 '22 23:09

Rexxars


You might also want to have a look at the cost of storing the cache data, in terms of your effort and CPU cost, against how what the cache will buy you.

If you find that 80% of your forum views are looking at the first page of threads, then you could decide to cache that page only. That would mean both cache reads and writes are much simpler to implment.

Likewise with the list of a user's favourite threads. If this is something that each person visits rarely then cache might not improve performance too much.

like image 31
Josh Avatar answered Sep 27 '22 22:09

Josh


I've managed to solve this by extending the memcache class with a custom class (say ExtendedMemcache) which has a protected property which will contain a hash table of group to key values.

The ExtendedMemcache->set method accepts 3 args ($strGroup,$strKey, $strValue) When you call set, it will store the relationship between $strGroup, and $strKey, in the protected property and then go on to store the $strKey to $strValue relationship in memcache.

You can then add a new method to the ExtendedMemcache class called "deleteGroup", which will, when passed a string, find that keys associated to that group, and purge each key in turn.

It would be something like this: http://pastebin.com/f566e913b I hope all that makes sense and works out for you.

PS. I suppose if you wanted to use static calls the protected property could be saved in memcache itself under it's own key. Just a thought.

like image 32
flungabunga Avatar answered Sep 27 '22 22:09

flungabunga


You're essentially trying to cache a view, which is always going to get tricky. You should instead try to cache data only, because data rarely changes. Don't cache a forum, cache the thread rows. Then your db call should just return a list of ids, which you already have in your cache. The db call will be lightening fast on any MyISAM table, and then you don't have to do a big join, which eats db memory.

like image 42
Brent Avatar answered Sep 27 '22 23:09

Brent