Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to invalidate parts of a hierarchy (tree) of data in Redis cache

Tags:

caching

redis

I have some product data that I need to store multiple versions of in a Redis cache. The data is made up of JSON-serialised objects. The process of obtaining the plain (basic) data is expensive, and the process of customising it into different versions is also expensive, so I'd like to cache all versions to optimise wherever possible. The data structure looks something like this:

                                    BaseProduct
                                         /\
                                      /      \
                                   /            \
                                /                  \
                             /                        \
           CustomisedProductA                          CustomisedProductB
                  /  \                                       /  \
CustomisedProductA1  CustomisedProductA2   CustomisedProductB1  CustomisedProductB2

The general idea here is:

  • There is a base product stored in a database.
  • One level of customisation can be applied to this product - e.g. information about a specific version of this product for a sales region.
  • A second level of customisation can be applied within that - e.g. information about this product at a particular store within a region.

The data is stored in this way because each step of the data retrieval/calculation process is expensive. The first time a particular product is retrieved for a region, there will be one set of customisations performed to make it into a region-specific product. The first time a particular product is retrieved for a store, I need to perform customisations based on the regional product to generate the store-specific product.

The problem comes in due to the fact that I may need to invalidate data in a few ways:

  • If the base product data changes, then the whole tree needs to be invalidated and everything needs to be regenerated. I can achieve this by storing the whole structure in a hash and deleting the hash by its key.
  • If the first set of customisations for a product change (i.e. the middle level), then I need to invalidate the nodes underneath this level too. For example, if the customisations for CustomisedProductA are affected by a change, I need to expire CustomisedProductA, CustomisedProductA1, and CustomisedProductA2.
  • If the second set of customisations for a product change (i.e. the bottom level), then that node needs to be invalidated. I can achieve this in a hash by calling HDEL key field (e.g. HDEL product CustomisedProductA:CustomisedProductA1).

My question is therefore: is there a way of representing this type of multi-level data structure, to allow for the performance of storing the data in multiple levels while enabling invalidation of only parts of the tree? Or, am I limited to expiring the entire tree (DEL key) or specific nodes (HDEL key field) but nothing in between?

like image 890
John Avatar asked Sep 21 '25 00:09

John


1 Answers

There are at least 3 different ways for doing that, each has its own pros and cons.

The first approach is to use non-atomic ad-hoc scanning of the tree to identify and invalidate (delete) the tree's 2nd level (1st set of customizations). To do that, use a hierarichal naming scheme for your Hash's fields and iterate through them using HSCAN. For example, assuming that your Hash's key name is the product's ID (e.g. ProductA), you'd use something like '0001:0001' as the field name for the first customization's first version, '0001:0002' for its second version and so forth. Similarly, '0002:0001' would be the 2nd customization 1st version, etc... Then, do find all of customization 42's versions, use HSCAN ProductA 0 MATCH 0042:*, HDEL the fields in the reply, and repeat until the cursor zeros.

The opposite approach is to proactively "index" each customization's versions so you can fetch them efficiently instead of performing the Hash's full scan. The way to go about that is using Redis' Sets - you keep a Set with all the field names for a given product's version. Versions can either be sequential (as in my example) or anything else as long as they are unique. The cost is maintaining these indices - whenever you add or remove a product's customization and/or version, you'll need to maintain consistency with these Sets. For example, the creation of a version would be something like:

HSET ProductA 0001:0001 "<customization 1 version 1 JSON payload"
SADD ProductA:0001 0001

Note that these two operations should be in a single transaction (i.e. use a MULTI\EXEC block or EVAL a Lua script). When you have this set up, invalidating a customization is just a matter of calling SMEMBERS on the relevant Set and deleting the versions in it from the Hash (and the Set itself as well). It is important to note, however, that reading all members from a large Set could be time consuming - 1K members isn't that bad, but for larger Sets there's SSCAN.

Lastly, you could consider using a Sorted Set instead of a Hash. While perhaps less intuitive in this use case, the Sorted Set will let you perform all the operations you need. The price for using it, however, is the increased complexity of O(logN) for adding/removing/reading compared to the Hash's O(1), but given the numbers the difference isn't significant.

To unleash the Sorted Set's power, you'll use lexicographical ordering so all of the Sorted Set's members should have the same score (e.g. use 0). Each product will be represented by a Sorted Set, just like with the Hash. The members of the Set are the equivalents of the Hash's field, namely customizations' versions. The "trick" is constructing the members in a way that allows you to perform range searches (or level-2 invalidations if you will). Here's an example of how it should look like (note that here the key ProductA isn't a Hash but a Sorted Set):

ZADD ProductA 0 0001:0001:<JSON>

To read a customization version, use ZRANGEBYLEX ProductA [0001:0001: [0001:0001:\xff and split the JSON from the reply and to remove an entire customization, use ZREMRANGEBYLEX.

like image 84
Itamar Haber Avatar answered Sep 23 '25 05:09

Itamar Haber