Memory efficient way to store 32 bit signed integer in Redis

Tags:

redis

Since Redis tries to parse strings as 64-bit signed integers, is it a good idea to store the binary representation of a 32-bit signed integer instead of a radix-10 integer string?

In our system we have lists of many 32-bit signed integer IDs.

I can store them like
lpush mykey 102450  --> Redis casts 102450 to an 8-byte long

or store them like
lpush mykey \x00\x01\x90\x32  --> this is just 4 bytes
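
For reference, here is a minimal Python sketch of the two client-side encodings being compared. It assumes big-endian byte order, which the question does not specify, and the helper names are made up for illustration. It only measures the payload sent to Redis, not what Redis allocates for it.

```python
import struct


def encode_decimal(value: int) -> bytes:
    """Radix-10 text form, as sent by `lpush mykey 102450`."""
    return str(value).encode("ascii")


def encode_packed(value: int) -> bytes:
    """4-byte signed form; big-endian order is an assumption."""
    return struct.pack(">i", value)


def decode_packed(raw: bytes) -> int:
    """Inverse of encode_packed."""
    return struct.unpack(">i", raw)[0]


if __name__ == "__main__":
    ident = 102450
    print(len(encode_decimal(ident)))   # payload length of the decimal text
    print(encode_packed(ident).hex())   # hex dump of the packed form
```

The packed form is always 4 bytes, while the decimal text grows with the number of digits (up to 11 characters for a negative 32-bit value).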
Aresn asked Apr 22 '13 00:04



2 Answers

Internally, Redis already stores strings in the most efficient manner it can. Packing integers into binary strings yourself will actually use more memory than sending them as plain radix-10 integers.

Here is how Redis stores strings:

  1. Integers less than 10000 are stored in a shared memory pool and don't have any memory overhead. If you wish, you can increase this limit by changing the constant REDIS_SHARED_INTEGERS in redis.h and recompiling Redis.
  2. Integers of 10000 or more that fit within the range of a long consume 8 bytes.
  3. Regular strings take len(string) + 4 bytes for length + 4 bytes for marking free space + 1 byte for null terminator + 8 bytes for malloc overhead.

In the example you quoted, it's a question of 8 bytes for a long versus 21 bytes (4 + 4 + 4 + 1 + 8) for the 4-byte string.
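
The three rules above can be turned into a rough per-value estimate. This is only a sketch of the arithmetic in this answer, not Redis's actual code: the function and constant names are made up for illustration, and it assumes the shared pool covers 0 to 9999 and that only canonical decimal text counts as an integer.

```python
SHARED_INTEGER_LIMIT = 10000            # mirrors REDIS_SHARED_INTEGERS
LONG_MIN, LONG_MAX = -(2 ** 63), 2 ** 63 - 1


def estimate_value_bytes(value: bytes) -> int:
    """Estimate the bytes used by one string value, per the rules above."""
    try:
        text = value.decode("ascii")
        number = int(text)
        # Reject forms like " 12" or "+12" that int() would accept.
        is_integer = str(number) == text and LONG_MIN <= number <= LONG_MAX
    except (UnicodeDecodeError, ValueError):
        is_integer = False

    if is_integer:
        if 0 <= number < SHARED_INTEGER_LIMIT:
            return 0    # rule 1: shared pool, no extra memory
        return 8        # rule 2: stored as a long
    # rule 3: payload + length + free-space marker + terminator + malloc
    return len(value) + 4 + 4 + 1 + 8


if __name__ == "__main__":
    print(estimate_value_bytes(b"102450"))             # decimal text
    print(estimate_value_bytes(b"\x00\x01\x90\x32"))   # packed 4-byte form
```

With these rules the decimal text comes out at 8 bytes and the packed form at 21, which is the comparison made above.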

EDIT:

So if I have a set of numbers all less than 10,000, how does Redis store my set?

It depends on how many elements you have.

If your set has no more than 512 elements (see set-max-intset-entries), it will be stored as an IntSet. An IntSet is a glorified name for a sorted integer array. Since your numbers are less than 10000, it would use 16 bits per element. It is (almost) as memory efficient as a C array.
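
To make the "sorted integer array" idea concrete, here is a toy Python model. It is not how Redis implements IntSet; the class name is hypothetical, and it is fixed at 16-bit elements, whereas the real structure widens its element size when a larger value arrives.

```python
from array import array
from bisect import bisect_left


class SortedInt16Set:
    """Toy model of an IntSet limited to 16-bit signed values."""

    def __init__(self):
        # Type code 'h' is a signed short: 2 bytes per element.
        self._items = array("h")

    def add(self, value: int) -> bool:
        """Insert in sorted position; return False if already present."""
        pos = bisect_left(self._items, value)
        if pos < len(self._items) and self._items[pos] == value:
            return False
        self._items.insert(pos, value)
        return True

    def __contains__(self, value: int) -> bool:
        # Membership is a binary search over the sorted array.
        pos = bisect_left(self._items, value)
        return pos < len(self._items) and self._items[pos] == value

    def __len__(self) -> int:
        return len(self._items)

    def payload_bytes(self) -> int:
        """Bytes used by the elements alone, excluding any header."""
        return len(self._items) * self._items.itemsize
```

Because the elements sit in one contiguous sorted array, there is no per-element pointer or wrapper, which is where the memory saving comes from. The price is that inserts shift elements, so this layout only makes sense for small sets.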

If you have more than 512 elements, the set becomes a hash table. Each element in the set is wrapped in a structure called robj, which has an overhead of 16 bytes. The robj structure holds a pointer into the shared pool of integers, so you don't pay anything extra for the integer itself. Finally, the robj instances are stored in the hash table, and the hash table has an overhead that is proportional to the size of the set.
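
A back-of-the-envelope sketch of the two cases, for a set of integers all below 10000. The function name is made up, and `table_bytes_per_entry` is a hypothetical knob for the hash table's own cost, which this answer does not quantify. Treating a set of exactly 512 elements as still being an IntSet is also an assumption.

```python
INTSET_MAX_ENTRIES = 512   # default set-max-intset-entries
INTSET_ELEMENT_BYTES = 2   # 16 bits per element for values below 10000
ROBJ_OVERHEAD = 16         # bytes per robj wrapper, as stated above


def estimate_small_int_set_bytes(n: int, table_bytes_per_entry: int) -> int:
    """Rough element cost for a set of n integers, all below 10000."""
    if n <= INTSET_MAX_ENTRIES:
        # Compact sorted array: only the elements themselves.
        return n * INTSET_ELEMENT_BYTES
    # Hash table: one robj per element plus the table's per-entry cost.
    # The integers live in the shared pool, so they add nothing.
    return n * (ROBJ_OVERHEAD + table_bytes_per_entry)


if __name__ == "__main__":
    # 24 is an illustrative guess for the table cost, not a measured figure.
    for size in (512, 513):
        print(size, estimate_small_int_set_bytes(size, table_bytes_per_entry=24))
```

Whatever the real table cost is, the estimate jumps sharply when the set crosses the threshold, so keeping sets under set-max-intset-entries (or raising it) matters more than how each integer is encoded.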

If you are interested in exactly how much memory an element consumes, run redis-rdb-tools on your dataset (disclaimer: I am the author of this tool). Or you can read the source code for the class MemoryCallback; the comments explain how the memory is laid out.

Sripathi Krishnan answered Oct 04 '22 14:10


Strings are stored with a length, so it won't be just 4 bytes in the database. It's probably stored as 4 bytes of data + 4 bytes of length + padding, so you don't gain anything.

rmmh answered Oct 04 '22 14:10