
In-memory cache VS. centralized cache in a distributed system

We're currently looking for the most suitable way to access critical data in a distributed system, and we're weighing in-memory caching on each instance against a centralized cache.

Some information about the data we wish to store/access:

  • Very small data size
  • Data is very cold: it barely changes, and only when a human changes something in our back office system
  • Has to be up to date shortly after a change (a delay of a few hundred milliseconds is OK)
  • Sits on a very critical path of our application, so it requires a very high SLA for both reliability and response time (no more than 20ms per access)
  • Data is read from frequently (up to thousands of times per second)

The way we see it is as follows:

In memory cache

Pros:

  • Quicker than network access + serialization
  • Higher reliability in terms of distribution (if one instance dies, the data still exists on the other instances)

Cons:

  • Much more complex to code and maintain
  • Requires notifying instances when a change occurs and updating each instance separately, plus loading the data on startup of each server
  • Adds a high risk of data inconsistency (one instance having different or outdated data than others)
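To make the "notify every instance" complexity concrete, here is a minimal sketch of the per-instance approach in plain Python. The `Broker` class is a hypothetical stand-in for whatever pub/sub mechanism would carry the change notifications (Redis pub/sub, a message queue, etc.); in a real deployment the fan-out crosses the network and can fail or lag, which is exactly the consistency risk listed above.

```python
class Broker:
    """Hypothetical stand-in for a pub/sub channel (e.g. Redis pub/sub)."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, key, value):
        # Fan the change out to every registered instance.
        for cb in self.subscribers:
            cb(key, value)


class InstanceCache:
    """Per-instance in-memory cache; each app server holds its own copy."""
    def __init__(self, broker, initial_data):
        self.data = dict(initial_data)    # loaded on server start
        broker.subscribe(self.on_change)  # update channel

    def on_change(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)


broker = Broker()
a = InstanceCache(broker, {"feature_x": "on"})
b = InstanceCache(broker, {"feature_x": "on"})
broker.publish("feature_x", "off")  # a back-office change fans out
print(a.get("feature_x"), b.get("feature_x"))  # off off
```

If the notification to one instance is lost, that instance serves stale data until the next successful update; a real implementation would need versioning or periodic reconciliation on top of this.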

Centralized cache

For the sake of conversation, we've considered using Redis.

Pros:

  • Much simpler to maintain
  • Very reliable, we have a lot of experience working with Redis in a distributed system
  • Only one place to update
  • Assures data consistency

Cons:

  • Single point of failure (this is a big concern for us), although if we go with this solution we will deploy a cluster
  • What happens if the cache is flushed for some reason?
Ron asked Jun 27 '16 14:06




1 Answer

I don't see any problem with going for a centralized cache using Redis.

  1. You are going to have a cluster setup anyway, so if a master fails, a slave will take over.
  2. If the cache is flushed for some reason, you will have to rebuild it; in the meantime, requests will fall back to the primary source (the DB).
  3. You can enable persistence and load the data persisted on disk, recovering the cache in seconds (plug and play). If you are worried about inconsistency, follow the method below.
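For point 3, persistence is a `redis.conf` setting; a minimal AOF-based sketch (the directive names are standard Redis configuration options, the values here are just one reasonable choice):

```
appendonly yes          # enable append-only-file persistence
appendfsync everysec    # fsync once per second: at most ~1s of writes lost on crash
```

With the AOF enabled, a restarted Redis node replays the file and comes back with its data intact, which is what makes the "load in seconds" recovery possible.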

Even if the cache is not available, the system should keep working (with higher latency, obviously). The app logic should first check the cache in Redis; if the value is not there, or Redis itself is unavailable, it should read the value from the DB, populate Redis with it, and then serve it to the client.

This way, even if your Redis master and slaves are down, your application will still work, just with a delay. Your cache will also stay up to date.
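The read path described above is the cache-aside pattern, and can be sketched in plain Python. `DictCache` and `UnavailableCache` are hypothetical stand-ins for a Redis client (healthy and down, respectively), and `db` stands in for the primary datastore:

```python
class DictCache:
    """Stand-in for a healthy Redis client."""
    def __init__(self):
        self.store = {}

    def get(self, key):
        return self.store.get(key)

    def set(self, key, value):
        self.store[key] = value


class UnavailableCache:
    """Simulates a Redis outage: every operation raises."""
    def get(self, key):
        raise ConnectionError("cache down")

    def set(self, key, value):
        raise ConnectionError("cache down")


def read(key, cache, db):
    """Cache-aside: try the cache, fall back to the DB, repopulate on a miss."""
    try:
        value = cache.get(key)
        if value is not None:
            return value  # cache hit: fast path
    except ConnectionError:
        # Cache cluster is down: serve from the DB, skip repopulation.
        return db[key]
    value = db[key]            # cache miss: read the source of truth
    try:
        cache.set(key, value)  # repopulate so the next read is fast
    except ConnectionError:
        pass
    return value


db = {"feature_x": "on"}
print(read("feature_x", DictCache(), db))         # miss -> DB -> repopulate
print(read("feature_x", UnavailableCache(), db))  # cache down -> DB fallback
```

Every path returns the correct value; only the latency differs, which is the "works fine but with a delay" behavior described above.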

Hope this helps.

Karthikeyan Gopall answered Sep 17 '22 21:09