 

Get random / any value from Redis hash

I have a Redis hash with millions of elements, and new ones are constantly being added. In PHP, I run an endless loop to get, process, and delete one element after the other. For this, I need to get the key of any existing element (preferably the first one inserted into the hash, FIFO).

while ($redis->hLen($hashKey)) {
    $key = ???; // how to get the key of any (ideally the oldest) field?
    // process $key
}

While I know the RANDOMKEY and SRANDMEMBER commands, I did not find any way to get a single key of a hash. HGETALL and HKEYS are, due to the size of the hash, not an option either. I need sequential processing. Help appreciated.

Zsolt Szilagyi, asked Jun 20 '13


1 Answer

There is no trick to access a random item (or the first, or the last) of a given hash object.

Should you need to iterate on hash objects, you have several possibilities:

  • a first one is to complement the hash with another data structure you can slice (like a list or a zset). If you only add items to the hash (and iterate to delete them), a list is enough. If you can add/remove/update items (and iterate to delete them), then a zset is required (use a timestamp as the score). Both lists and zsets can be sliced (LRANGE, ZRANGE, ZRANGEBYSCORE), so it is easy to iterate on them chunk by chunk and keep both data structures in sync.

  • a second one is to complement the hash with another data structure supporting pop-like operations, such as a list or a set (LPOP, RPOP, SPOP). Instead of iterating on the hash object, you can pop all the keys out of the secondary structure and maintain the hash object accordingly. Again, both data structures need to be kept in sync.

  • a third one is to split the hash object into many pieces. This is actually memory efficient, because your keys are stored only once, and Redis can leverage the ziplist memory optimization for small hashes.
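The first two options can be sketched in plain Python (an in-memory dict and deque stand in for the Redis hash and list; with a real client the writes map to HSET+LPUSH and the destructive reads to RPOP+HGET+HDEL — the class and method names here are purely illustrative):

```python
from collections import deque

class FifoHash:
    """A hash complemented by a list, so the oldest key can always be popped."""
    def __init__(self):
        self.hash = {}        # stands in for the Redis hash (HSET/HGET/HDEL)
        self.queue = deque()  # stands in for a Redis list (LPUSH/RPOP)

    def add(self, key, value):
        if key not in self.hash:        # keep both structures in sync
            self.queue.appendleft(key)  # LPUSH
        self.hash[key] = value          # HSET

    def pop_oldest(self):
        key = self.queue.pop()          # RPOP: first in, first out
        return key, self.hash.pop(key)  # HGET + HDEL

# usage
q = FifoHash()
q.add("key1", "xxxx")
q.add("key2", "yyyy")
```

The third option, splitting the hash itself, is developed below.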

So instead of storing your hash as:

myobject -> { key1:xxxx, key2:yyyy, key3:zzzz }

you can store:

myobject:<hashcode1> -> { key1:xxxx, key3:zzzz }
myobject:<hashcode2> -> { key2:yyyy }
...

To calculate the extra hashcode, you can apply any hash function to your keys that offers a good distribution. In the above example, we assume key1 and key3 have the same hashcode1 value and key2 has the hashcode2 value.

You can find more information about this kind of data structures here:

Redis 10x more memory usage than data

The output cardinality of the hash function should be chosen so that the number of items per hash object is limited to a given value. For instance, if we choose to have 100 items per hash object and we need to store 1M items, we will need a cardinality of 10K. To limit the cardinality, simply applying a modulo operation to a generic hash function is enough.
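As a sketch in Python (CRC32 is one such generic hash function with decent distribution; the `myobject` prefix and the 10K bucket count are just the figures from the example above):

```python
import zlib

NUM_BUCKETS = 10_000  # cardinality: 1M items / 100 items per hash object

def bucket_key(field: str, prefix: str = "myobject") -> str:
    """Route a field name to its sub-hash: generic hash, then modulo."""
    hashcode = zlib.crc32(field.encode("utf-8")) % NUM_BUCKETS
    return f"{prefix}:{hashcode}"
```

A given field always maps to the same sub-hash, so HSET/HGET/HDEL on `bucket_key(field)` behave exactly like operations on the single big hash, while each sub-hash stays small enough for the ziplist encoding.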

The benefit is that the data will be compact in memory (ziplists are used), and you can easily iterate destructively over the hash objects by pipelining HGETALL+DEL on all of them:

hgetall myobject:0
... at most 100 items will be returned, process them ...
del myobject:0
hgetall myobject:1
... at most 100 items will be returned, process them ...
del myobject:1
...

You can therefore iterate chunk by chunk with a granularity which is determined by the output cardinality of the hash function.
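The loop above can be sketched in Python (an in-memory dict of buckets stands in for the server, so the chunking logic itself is runnable; with a real client, the pop below corresponds to a pipelined HGETALL followed by DEL — the function name is illustrative):

```python
def drain_buckets(buckets, num_buckets, process):
    """Iterate chunk by chunk: fetch one sub-hash, process its items, delete it.

    `buckets` maps names like "myobject:0" to dicts (stand-ins for Redis
    hashes). With a real client, each step is r.hgetall(name) + r.delete(name),
    ideally sent in one pipeline.
    """
    for i in range(num_buckets):
        name = f"myobject:{i}"
        items = buckets.pop(name, {})  # HGETALL + DEL in one step here
        for field, value in items.items():
            process(field, value)

# usage
seen = []
store = {"myobject:0": {"key1": "xxxx", "key3": "zzzz"},
         "myobject:1": {"key2": "yyyy"}}
drain_buckets(store, 2, lambda field, value: seen.append(field))
```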

Didier Spezia, answered Nov 15 '22