I have a fairly static set of data that I want to be able to access as quickly as possible. Should I cache the data in Memcached, or should I store it in a HEAP table or something similar inside MySQL? Would one scale better than the other?
Is there some other option that's even faster?
Memcached can process over 50 million keys per second on a 48-core machine using only RAM and heavy batching.
Unlike databases that store data on disk or SSDs, Memcached keeps its data in memory. By eliminating the need to access disks, in-memory key-value stores such as Memcached avoid seek time delays and can access data in microseconds. Memcached is also distributed, meaning that it is easy to scale out by adding new nodes.
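The expiration time in Memcached is given in seconds. A value of 0 means the item never expires, although it can still be evicted when memory is needed. A relative expiration can be at most 2592000 seconds (30 days); larger values are interpreted as an absolute Unix timestamp. The server itself has no built-in default, so a default such as 10800 seconds would come from a client library or application setting rather than from Memcached.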
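As a rough illustration of how an expiration value is interpreted, here is a minimal sketch. It assumes the behaviour described in the Memcached protocol documentation (0 means no expiry, values up to 30 days are relative offsets, larger values are absolute Unix timestamps, negative values expire immediately). The helper name `resolve_expiry` is hypothetical and is not part of any client library.

```python
import time

THIRTY_DAYS = 60 * 60 * 24 * 30  # 2592000 seconds

def resolve_expiry(exptime, now=None):
    """Return the absolute Unix time at which an item expires, or None for 'never'.

    Hypothetical helper that mirrors the documented expiration rules.
    """
    if now is None:
        now = int(time.time())
    if exptime == 0:
        return None              # no expiry (the item may still be evicted)
    if exptime < 0:
        return now               # negative values: item is expired immediately
    if exptime <= THIRTY_DAYS:
        return now + exptime     # relative offset in seconds
    return exptime               # treated as an absolute Unix timestamp
```

So a TTL of 60 resolves to "60 seconds from now", while 2592001 is read as a timestamp early in 1970 and the item is effectively already expired.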
Memcached and Redis are both categorized as NoSQL, which stands for non-relational or non-SQL. Relational databases are the most common database type, but they can be hard to scale out for very large or very high-throughput data sets, and NoSQL stores were introduced to fill that gap.
The fastest option would be in-memory caching on the local system. That won't scale well to many millions of relations, but will be very fast and work well for small data sets.
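A minimal sketch of machine-local, in-process caching, assuming the data is static enough that stale entries are not a concern. The backing store here is a plain dict standing in for MySQL or Memcached, and the names `fetch_from_backend`, `get`, and `stats` are hypothetical.

```python
from functools import lru_cache

# Hypothetical stand-in for the slower backing store (MySQL, Memcached, ...).
_BACKING_STORE = {"user:1": "alice", "user:2": "bob"}
stats = {"backend_reads": 0}

def fetch_from_backend(key):
    """Simulate a round trip to the backing store and count it."""
    stats["backend_reads"] += 1
    return _BACKING_STORE.get(key)

@lru_cache(maxsize=1024)  # bounded, so the cache cannot grow without limit
def get(key):
    """Return the value for key, hitting the backend only on a cache miss."""
    return fetch_from_backend(key)
```

Repeated calls to `get("user:1")` touch the backend once; after that the value is served from process memory. The `maxsize` bound is the reason this approach suits small data sets rather than millions of entries.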
I haven't done performance testing between Memcached/MySQL HEAP, but I'd guess Memcached would be faster because it doesn't have the overhead of a full relational DB engine. Memcached would almost certainly scale better, because you could distribute it between servers and have a round-robin request dispatch between them.
If you need to perform any filtering on the data before retrieving it, you should use MySQL. The performance overhead of transmitting unwanted data will probably outweigh the benefits of faster lookups.
If I were you, I'd load the data set in question into MySQL and Memcached, then run performance tests to see which is better for your data set. If there's a core of data that's accessed particularly often, consider an additional machine-local cache.
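One way to set up such a comparison is a small timing harness that accepts any lookup function. This is only a sketch: the single candidate below is a local dict used as a placeholder, and you would add your own functions that query MySQL and Memcached with whatever client libraries you use. The names `benchmark` and `candidates` are hypothetical.

```python
import time

def benchmark(lookup, keys, repeats=5):
    """Return the best average seconds-per-lookup over several passes."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for key in keys:
            lookup(key)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed / len(keys))
    return best

# Placeholder data and backend; add MySQL and Memcached lookups here.
data = {f"key{i}": i for i in range(10_000)}
candidates = {"local dict": data.get}

keys = list(data)
results = {name: benchmark(fn, keys) for name, fn in candidates.items()}
for name, per_lookup in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name}: {per_lookup * 1e6:.2f} microseconds per lookup")
```

Taking the best of several passes reduces noise from warm-up and other processes. For a fair test, use keys drawn from your real access pattern rather than a sequential scan.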
memcached will be faster for simple uses, hands down -- connection setup is so much cheaper on memcached, since there's no auth, buffer allocation, etc. Also, memcached is designed to easily distribute keys between multiple servers.
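To illustrate how keys can be spread over several servers, here is a minimal sketch of client-side distribution using a stable hash and a modulo. The server addresses and the function name `server_for` are hypothetical. Real client libraries often use consistent hashing instead, so that adding or removing a server remaps fewer keys.

```python
import zlib

# Hypothetical server list; a real client would be configured with actual hosts.
SERVERS = ["cache1:11211", "cache2:11211", "cache3:11211"]

def server_for(key, servers=SERVERS):
    """Pick the server responsible for a key using a stable hash (modulo scheme)."""
    checksum = zlib.crc32(key.encode("utf-8"))
    return servers[checksum % len(servers)]
```

Because every client computes the same hash, they all agree on which server holds a given key without any coordination between the servers.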
However, memcached is only a simple key/value store. If you need to do anything more complex with your data (even something like SELECT * FROM some_table WHERE x > 5), a HEAP table is much more powerful.
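The difference shows up as soon as a filter is involved. In the sketch below, an in-memory SQLite database is used purely as a stand-in for a MySQL HEAP table, and a plain dict stands in for a key/value store. The table and key names are hypothetical.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for a HEAP table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, x INTEGER)")
conn.executemany(
    "INSERT INTO items (id, x) VALUES (?, ?)", [(i, i) for i in range(10)]
)

# Table engine: the filter runs inside the engine, only matching rows come back.
rows = conn.execute("SELECT id, x FROM items WHERE x > 5 ORDER BY id").fetchall()

# Key/value store: no query language, so the client reads every value and filters.
kv = {f"item:{i}": i for i in range(10)}
filtered = sorted(value for value in kv.values() if value > 5)
```

With a key/value store the whole data set has to cross the network before it can be filtered, which is where the transfer overhead comes from.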
Robert Munteanu brings up a good point though. Your cache hierarchy should be:
If you don't need to propagate global changes to this data, then storing it in APC makes sense. If you need to access it several times during script execution, you should also cache it in globals in your script.
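The layering described here can be sketched as a read-through lookup that checks the fastest tier first. This is a sketch under stated assumptions: plain dicts stand in for script globals, APC, and the shared store, the class name `TieredCache` is hypothetical, and there is no invalidation because the data is assumed to be static.

```python
class TieredCache:
    """Read-through lookup across cache tiers, fastest first."""

    def __init__(self, tiers):
        # tiers: list of dict-like objects, ordered from fastest to slowest
        self.tiers = tiers

    def get(self, key):
        for depth, tier in enumerate(self.tiers):
            if key in tier:
                value = tier[key]
                # Copy the value into every faster tier for the next lookup.
                for faster in self.tiers[:depth]:
                    faster[key] = value
                return value
        return None  # not found in any tier

request_local = {}               # analogous to globals within one script run
machine_local = {}               # analogous to APC on a single machine
shared_store = {"config": "v1"}  # stands in for Memcached or MySQL
cache = TieredCache([request_local, machine_local, shared_store])
```

After the first `cache.get("config")`, the value sits in both faster tiers, so later lookups never reach the shared store.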