
Cache huge data in-memory

I am looking for an in-memory cache solution that can handle big data (<5GB). For a search term entered by the user, the database (elasticsearch) returns a large amount of data, which the tool analyzes and presents across several of its webpages. My problem is that I want to cache this data temporarily, until the user session ends, so that I don't have to fetch it from elasticsearch again every time the user opens a new page. The cache has to be in-memory because a disk-based one would take over a minute, which is far too slow.

I initially considered memcached, but it has a max limit of 128MB. After reading quite a bit, Redis seems suitable, but it is unclear to me whether a bunch of Redis nodes can work in tandem. Is it possible to set up a pool of Redis nodes so that a suitable node is chosen automatically on SET and the data is returned on GET without me having to specify the node?

TL;DR

  • Problem: Cache big data (<5GB) in an in-memory cache
  • Possible solution: Redis
  • Question: Can I pool a bunch of Redis nodes so that I can fetch a key stored in any of them without specifying a particular node? I don't need to split one user's data across nodes, since the data for a single user will fit into the RAM of a single node.
huhahihi asked Oct 31 '22 13:10


2 Answers

A Redis Cluster sounds like a good fit for your use case!

Redis Cluster provides a mechanism for data sharding by means of hash slots. There are 16384 slots, and they are distributed evenly over the master nodes in your cluster when you set it up.

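Whenever you store a value in the cluster, the hash slot for the given key is calculated and the command is routed to the node responsible for that slot. A cluster-aware client handles this for you: the nodes do not proxy requests, but a node that does not own the slot answers with a redirection to the one that does, and the client follows it. Reading the data back works the same way. So the answer to your question is yes, as long as you use a cluster-aware client.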
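To make the key-to-slot mapping concrete, here is a minimal Python sketch of the calculation as the Redis Cluster specification describes it: CRC16 (the XMODEM variant) of the key, modulo 16384, with only the part inside the first non-empty `{...}` hashed when the key contains a hash tag. The function names `crc16_xmodem` and `key_slot` are made up for this illustration; a real client library does this internally, so treat it as a way to see the idea rather than something you need to write yourself.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the hash slot (0..16383) for a key.

    If the key contains a non-empty hash tag such as "{user42}", only the
    text between the first "{" and the next "}" is hashed, so keys sharing
    a tag land in the same slot.
    """
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


if __name__ == "__main__":
    # Both keys share the tag "user42", so they map to the same slot.
    print(key_slot("{user42}:results"), key_slot("{user42}:meta"))
```

Since all keys with the same hash tag fall into one slot, tagging every key of a session with the user or session id keeps that user's data together on a single node, which matches the requirement in the question.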

However, the maximum size of a single string value is 512MB. I'm not sure I understood your storage requirement correctly; I assume 5GB is the estimated total across all users.
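If a single user's result set could exceed that per-value limit, one possible workaround is to split the payload over several keys. The sketch below is only an assumption about how that might look: `put_chunked`, `get_chunked`, the key layout and the chunk size are all hypothetical, and `store` is any dict-like object standing in for a real Redis client. The `{key}` prefix is a hash tag, which in Redis Cluster keeps all chunks of one payload in the same slot.

```python
from typing import MutableMapping

# Hypothetical chunk size, chosen to stay well below the 512MB value limit.
DEFAULT_CHUNK_SIZE = 256 * 1024 * 1024


def put_chunked(store: MutableMapping[str, bytes], key: str,
                payload: bytes, chunk_size: int = DEFAULT_CHUNK_SIZE) -> int:
    """Store payload under several keys and return the number of chunks."""
    chunks = [payload[i:i + chunk_size]
              for i in range(0, len(payload), chunk_size)]
    for index, chunk in enumerate(chunks):
        # "{key}" is a hash tag, so every chunk maps to the same slot.
        store[f"{{{key}}}:chunk:{index}"] = chunk
    store[f"{{{key}}}:count"] = str(len(chunks)).encode()
    return len(chunks)


def get_chunked(store: MutableMapping[str, bytes], key: str) -> bytes:
    """Reassemble a payload written by put_chunked."""
    count = int(store[f"{{{key}}}:count"])
    return b"".join(store[f"{{{key}}}:chunk:{i}"] for i in range(count))


if __name__ == "__main__":
    fake_store: dict = {}  # stand-in for a Redis connection
    put_chunked(fake_store, "session42", b"abcdefghij", chunk_size=4)
    print(get_chunked(fake_store, "session42"))
```

With a real client you would replace the dict assignments and lookups with SET and GET calls, and probably set an expiry on each key so the data disappears when the session ends.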

Check out the Redis Cluster tutorial.

Moritz answered Nov 12 '22 21:11


You can also look into NCache (.NET) / TayzGrid (Java) by Alachisoft.

Both of these solutions provide distributed caching with dynamic clustering, which lets you add or remove nodes in the cluster at runtime without losing any data. An intelligent client also makes sure that the appropriate node is contacted to fetch or store the record for any given key.

Sameer Shah answered Nov 12 '22 22:11