I'm new to using distributed caching solutions like Memcached on a large web site. I have a couple of questions; could someone with experience please comment?
Obviously the amount of data I can put into the cache depends on server RAM. Suppose I have a big enough server farm and enough RAM: is there a maximum number of objects I can put into Memcached before I start seeing performance degrade? The reason I ask is that I figure if I put literally millions of objects into Memcached, wouldn't it take longer to index and look them up? Is there a line to draw here?
Should I cache more, smaller objects in Memcached, or fewer, larger ones? Smaller objects do mean more round trips to the database to fetch them, but they are more flexible and easier to program against.
Thank you very much,
Ray.
Two common approaches are cache-aside or lazy loading (a reactive approach) and write-through (a proactive approach). A cache-aside cache is updated after the data is requested. A write-through cache is updated immediately when the primary database is updated.
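Below is a minimal sketch of both patterns, assuming the pymemcache Python client; the `fake_db` dictionary and the user functions are hypothetical stand-ins for a real primary database.

```python
# Sketch of cache-aside vs. write-through, assuming pymemcache is installed;
# fake_db stands in for the primary database.
import json
from pymemcache.client.base import Client

cache = Client(("localhost", 11211))
fake_db = {}  # stand-in for the primary database

# Cache-aside (lazy loading): the cache is populated only after a miss.
def get_user(user_id):
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)        # cache hit
    user = fake_db.get(user_id)          # cache miss: fall back to the database
    if user is not None:
        cache.set(key, json.dumps(user), expire=300)
    return user

# Write-through: the cache is updated whenever the database is updated.
def save_user(user_id, user):
    fake_db[user_id] = user
    cache.set(f"user:{user_id}", json.dumps(user), expire=300)
```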
What is Memcached? Memcached is an easy-to-use, high-performance, in-memory data store. It offers a mature, scalable, open-source solution for delivering sub-millisecond response times, making it useful as a cache or session store.
Memcached (pronounced variously mem-cash-dee or mem-cashed) is a general-purpose distributed memory-caching system. It is often used to speed up dynamic database-driven websites by caching data and objects in RAM to reduce the number of times an external data source (such as a database or API) must be read.
Redis is single-threaded, and measured per core it performs better than Memcached at storing small datasets. Memcached has a multi-threaded architecture and can use multiple cores, so for larger datasets it can outperform Redis.
Memcached uses a hash table internally, so lookups are O(1); it's designed to do as little complicated work as possible.
As for what to cache, big or small, it's really about storing whatever will save you the most effort (bearing in mind it's a big dumb cache: you have to keep it synchronised yourself if you change one piece of data that is also referred to elsewhere). On the original site it was written for, Livejournal.com, the largest block that made sense was one complete journal entry, stored as the finished HTML that could be served to anyone allowed to see that particular post.
I've used it for some very small entries too, literally a single number against a member ID, but I'm generating a few thousand such entries en masse with a single database query rather than one at a time as they're needed (see the sketch below).
You can tune the daemon somewhat if you know you will only be storing very large or very small items, but even with many small entries it is smart enough to split empty large slabs of memory into smaller chunks as needed.
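A minimal sketch of that bulk pre-warming pattern, assuming pymemcache's set_many()/get_many(); load_counts_from_db() is a hypothetical stand-in for the single query.

```python
# Pre-warm thousands of tiny entries with one bulk write, assuming pymemcache.
from pymemcache.client.base import Client

cache = Client(("localhost", 11211))

def load_counts_from_db():
    # Stand-in for one query returning {member_id: count} for many members.
    return {101: 7, 102: 0, 103: 42}

# One bulk write instead of thousands of individual sets.
counts = {f"member_count:{mid}": str(n) for mid, n in load_counts_from_db().items()}
cache.set_many(counts, expire=600)

# Later, fetch a batch of members in a single round trip.
print(cache.get_many(["member_count:101", "member_count:103"]))
```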
Suppose I have a big enough server farm and RAM, is there a maximum number of objects I can put into memcached before I start seeing performance degrade?
Ideally, your cache should be 100% full at all times. Memcached uses a hashing algorithm to look up keys, so as far as I know there shouldn't be a performance penalty for storing more keys.
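As an illustration of how that scales out, here is a sketch using pymemcache's HashClient; the server addresses are placeholders. The client hashes each key to pick a node, and each node's own lookup is a hash-table probe, so the cost doesn't grow with the number of stored items.

```python
# Sketch of spreading keys across a pool of memcached nodes, assuming pymemcache.
from pymemcache.client.hash import HashClient

client = HashClient([
    ("10.0.0.1", 11211),   # placeholder addresses
    ("10.0.0.2", 11211),
    ("10.0.0.3", 11211),
])

# The key is hashed to choose a node; the node then does an O(1) lookup.
client.set("user:12345", "cached-profile-blob", expire=300)
print(client.get("user:12345"))
```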
Should I cache more, smaller objects in memcached, or fewer, larger ones?
I would imagine that fewer, bigger objects would be preferable, to reduce the time spent on both database and cache lookups, but you should take this on a case-by-case basis. Unless you know the performance difference would be drastic, I'd recommend starting with whatever is easiest and working from there if that isn't sufficient.