 

Get all keys in Redis database with python

There is a post about a Redis command to get all available keys, but I would like to do it with Python.

Any way to do this?

asked Mar 07 '14 by tscizzle


People also ask

How do I get all Redis keys?

To list the keys in the Redis data store, use the KEYS command followed by a specific pattern; Redis returns all keys matching that pattern. For example, an asterisk (*) matches every key in the data store.
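For instance, the same pattern matching is available from Python through the redis-py package; this is only a minimal sketch, assuming a local server on the default port:

import redis

# assumes a Redis server running on localhost:6379
r = redis.StrictRedis(host='localhost', port=6379, db=0)

# KEYS * returns every key in the current database
# (convenient for small datasets, but it blocks the server on large ones)
print(r.keys('*'))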

How do I query Redis in Python?

Use the redis-py package. Create a Python file and open a connection to the Redis server; once connected, you can start performing operations. The connection can target a specific database index (for example, index 10), and set takes a key and a value as its first two arguments.
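A minimal sketch of that flow with redis-py (the host, port, database index, and key name here are only illustrative):

import redis

# connect to the database at index 10 (host and port are assumptions)
r = redis.Redis(host='localhost', port=6379, db=10)

# set takes the key and value as its first two arguments
r.set('greeting', 'hello')
print(r.get('greeting'))  # b'hello'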

How do I view Redis data?

A Redis server has 16 databases by default. You can check the actual number by running redis-cli config get databases. In interactive mode, the database number is displayed in the prompt within square brackets. For example, 127.0.0.1:6379[13] shows that database 13 is in use.
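The same information can be read from Python, since redis-py exposes CONFIG GET; a minimal sketch, assuming a local server:

import redis

r = redis.Redis(host='localhost', port=6379, db=0)

# CONFIG GET databases reports how many databases the server exposes (16 by default)
print(r.config_get('databases'))  # e.g. {'databases': '16'}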


1 Answer

Use scan_iter()

scan_iter() is superior to keys() for large numbers of keys because it gives you an iterator you can use rather than trying to load all the keys into memory.

I had 1B records in my Redis database and could never get enough memory to return all the keys at once.

SCANNING KEYS ONE-BY-ONE

Here is a Python snippet using scan_iter() to get all keys from the store matching a pattern and delete them one-by-one:

import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)

for key in r.scan_iter("user:*"):
    # delete the key
    r.delete(key)
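If you only want to collect or print the keys instead of deleting them, the same iterator works; a minimal sketch (the pattern is just an example):

import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)

# scan_iter() walks the keyspace incrementally, so memory use stays flat
for key in r.scan_iter("user:*"):
    print(key)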

SCANNING IN BATCHES

If you have a very large list of keys to scan - for example, more than 100k keys - it will be more efficient to scan them in batches, like this:

import redis
from itertools import izip_longest  # on Python 3, use itertools.zip_longest

r = redis.StrictRedis(host='localhost', port=6379, db=0)

# iterate a list in batches of size n
def batcher(iterable, n):
    args = [iter(iterable)] * n
    return izip_longest(*args)

# in batches of 500 delete keys matching user:*
# (the last batch is padded with None, so filter those out before deleting)
for keybatch in batcher(r.scan_iter('user:*'), 500):
    r.delete(*filter(None, keybatch))

I benchmarked this script and found that a batch size of 500 was 5 times faster than scanning keys one-by-one. I tested different batch sizes (3, 50, 500, 1000, 5000) and found that 500 seemed to be optimal.

Note that whether you use the scan_iter() or keys() method, the operation is not atomic and could fail part way through.

DEFINITELY AVOID USING XARGS ON THE COMMAND-LINE

I do not recommend the following example, which I have seen repeated elsewhere. It will fail for Unicode keys and is incredibly slow for even moderate numbers of keys:

redis-cli --raw keys "user:*" | xargs redis-cli del

In this example, xargs creates a new redis-cli process for every key - that's bad.

I benchmarked this approach as 4 times slower than the first Python example, which deletes keys one-by-one, and 20 times slower than deleting in batches of 500.

answered Sep 22 '22 by Patrick Collins