Redis — best way to store a large map (dictionary)

Tags:

redis

I need to store a one-to-one mapping. The dataset consists of a large number of key-value pairs of the same kind (10M+). In Java, for example, one could store such data in a single HashMap instance.

The first way to do this is to store lots of key-value pairs, like this:

SET map:key1 value1
...
SET map:key900000 value900000
GET map:key1

The second option is to use a single "Hash":

HSET map key1 value1
...
HSET map key900000 value900000
HGET map key1

Redis Hashes have some convenient commands (HMSET, HMGET, HGETALL, etc.), and they don't pollute the keyspace, so this looks like a better option. However, are there any performance or memory considerations when using this approach?

Max Malysh asked May 06 '15 21:05


1 Answer

Yes, as Itamar Haber says, you should look at the Redis memory optimization guide. But you should also keep a few more things in mind:

  1. Prefer hashes (HSET) to top-level keys. Redis consumes a lot of memory just on key space management. In simple (and rough) terms, one hash with 1,000,000 fields consumes up to 10x less memory than 1,000,000 top-level keys with one value each.
  2. Keep each hash smaller than hash-max-zipmap-entries, and each value within hash-max-zipmap-value, if memory is the main concern. (Since Redis 2.6 these settings are named hash-max-ziplist-entries and hash-max-ziplist-value.) Be sure to understand what hash-max-zipmap-entries and hash-max-zipmap-value mean. Also, take some time to read about ziplist.
  3. You do not actually want to raise hash-max-zipmap-entries far enough to cover 10M+ keys; instead, split the one hash into multiple buckets. For example, suppose you set hash-max-zipmap-entries to 10,000. To store 10M+ keys you then need 1,000+ hashes with about 10,000 fields each. As a rough rule of thumb, pick the bucket with crc32(key) % maxHsets.
  4. Read about strings in Redis and choose the length of the key names (the field names inside the hash) based on how this structure actually allocates memory. In simple terms, if you keep the key length under 7 bytes you spend 16 bytes per key, but an 8-byte key costs 48 bytes. Why? Read about simple dynamic strings.
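
The bucketing from point 3 can be sketched roughly as follows. This is only an illustration, not a definitive implementation: the helper names `num_buckets` and `bucket_name`, the `map` prefix, and the 10,000-entry limit are assumptions made for the example. Because CRC32 does not spread keys perfectly evenly, some buckets will hold more than the average, so in practice you would probably use somewhat more buckets than the exact quotient.

```python
import zlib


def num_buckets(total_keys: int, max_entries_per_hash: int) -> int:
    """Smallest bucket count so the *average* bucket fits the limit (ceiling division)."""
    return (total_keys + max_entries_per_hash - 1) // max_entries_per_hash


def bucket_name(key: str, buckets: int, prefix: str = "map") -> str:
    """Name of the Redis hash that should hold `key`: prefix:<crc32(key) % buckets>."""
    slot = zlib.crc32(key.encode("utf-8")) % buckets
    return f"{prefix}:{slot}"


# Assumed numbers from the answer: 10M keys, at most 10,000 fields per hash.
BUCKETS = num_buckets(10_000_000, 10_000)

# With a client such as redis-py (assumed here, not part of this sketch),
# the calls would look roughly like:
#   r.hset(bucket_name("key1", BUCKETS), "key1", "value1")
#   r.hget(bucket_name("key1", BUCKETS), "key1")
print(BUCKETS, bucket_name("key1", BUCKETS))
```

The same key always maps to the same hash, so reads and writes only need the key itself to find the right bucket.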

It may be useful to read about:

  • Redis Memory Optimization (from sripathikrishnan)
  • Comments about internal ziplist structure.
  • Storing hundreds of millions of simple key-value pairs in Redis (Instagram)

Nick Bondarenko answered Nov 16 '22 23:11