A number of sources, including the official Redis documentation, note that using the KEYS command is a bad idea in production environments due to possible blocking. If the approximate size of the dataset is known, does SCAN have any advantage over KEYS?
For example, consider a database with at most 100 keys of the form data:number:X, where X is an integer. If I want to retrieve all of these, I might use the command KEYS data:number:*. Is this going to be significantly slower than using SCAN 0 MATCH data:number:* COUNT 100? Or are the two commands essentially equivalent in this circumstance? Would it be accurate to say that SCAN is preferable to KEYS because it protects against the scenario where an unexpectedly large set would be returned?
KEYS pattern: Although the time complexity of this command is O(N) in the number of keys in the database, the constant factor is fairly low (e.g. Redis running on a personal laptop can scan a 1 million key database in 40 milliseconds). Caution must still be taken when using this command in production environments with very stringent latency requirements, because it is a blocking call.
SCAN is a cursor based iterator. This means that at every call of the command, the server returns an updated cursor that the user needs to use as the cursor argument in the next call. An iteration starts when the cursor is set to 0, and terminates when the cursor returned by the server is 0.
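The cursor loop described above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: scan_all is a hypothetical helper, and it assumes a client object whose scan(cursor, match, count) method returns a (next_cursor, keys) pair, which is the shape redis-py's Redis.scan uses. FakeRedis is a made-up in-memory stand-in so the sketch runs without a server. Note that COUNT is only a hint and MATCH is applied after elements are retrieved, so a single SCAN 0 MATCH data:number:* COUNT 100 call is not guaranteed to return every matching key; the loop has to continue until the cursor comes back as 0.

```python
from fnmatch import fnmatchcase


def scan_all(client, match, count=100):
    """Collect every key matching `match` by following the SCAN cursor."""
    keys = set()  # a set, because SCAN may return the same key more than once
    cursor = 0    # an iteration starts with cursor 0
    while True:
        cursor, batch = client.scan(cursor=cursor, match=match, count=count)
        keys.update(batch)
        if cursor == 0:  # the server returns 0 when the iteration is complete
            return keys


class FakeRedis:
    """Hypothetical in-memory stand-in so the sketch runs without a server."""

    def __init__(self, keys):
        self._keys = sorted(keys)

    def scan(self, cursor=0, match=None, count=10):
        # Look at `count` keys starting at the cursor, then filter by pattern.
        chunk = self._keys[cursor:cursor + count]
        next_cursor = cursor + count
        if next_cursor >= len(self._keys):
            next_cursor = 0  # signal that the iteration is finished
        hits = [k for k in chunk if match is None or fnmatchcase(k, match)]
        return next_cursor, hits


fake = FakeRedis([f"data:number:{i}" for i in range(100)] + ["other:1"])
found = scan_all(fake, "data:number:*", count=10)
print(len(found))  # prints 100
```

With a real redis-py client, the returned keys are bytes unless the connection was created with decode_responses=True.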
Redis can handle up to 2^32 keys, and was tested in practice to handle at least 250 million keys per instance. Every hash, list, set, and sorted set, can hold 2^32 elements. In other words your limit is likely the available memory in your system.
The first command you can use to get the total number of keys in a Redis database is the DBSIZE command. This simple command should return the total number of keys in a selected database as an integer value. The above example command shows that there are 203 keys in the database at index 10.
You shouldn't care about the execution time of the current command alone but about its impact on all other commands, since Redis processes commands using a single thread (i.e. while a command is being executed, all others must wait until it finishes).
While keys or scan might provide you similar or identical performance executed alone in your case, some milliseconds blocking Redis will significantly decrease overall I/O.
This is the main reason to use keys for development purposes and scan in production environments.
OP said:
"While keys or scan might provide you similar or identical performance executed alone in your case, some milliseconds blocking Redis will significantly decrease overall I/O." - This sentence seems to indicate that one command blocks Redis, and the other doesn't, which can't be the case. If I am guaranteed 100 results from my call to KEYS, in what way is it worse than SCAN? Why do you feel that one command is more prone to blocking?
There is a real difference when you can paginate the search. Being forced to get 100 keys in a single pass is not the same as being able to paginate and get 100 keys 10 at a time (or 50 and 50). The small gaps between calls let Redis process other commands sent by the application layer. See what the official Redis documentation says about this:
Since these commands allow for incremental iteration, returning only a small number of elements per call, they can be used in production without the downside of commands like KEYS or SMEMBERS that may block the server for a long time (even several seconds) when called against big collections of keys or elements.
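To put rough numbers on the pagination argument, here is a toy model in Python. It is only a back-of-the-envelope sketch: the per-key cost is an assumption derived from the "1 million keys in 40 milliseconds" figure quoted earlier, the function names are hypothetical, and network round trips and reply serialization are ignored.

```python
# Assumption: 40 ms / 1,000,000 keys = 40 ns per key visited.
COST_PER_KEY_NS = 40


def longest_block_ns(keyspace_size, keys_per_call):
    """Longest single stretch during which the server can serve nothing else."""
    return min(keyspace_size, keys_per_call) * COST_PER_KEY_NS


def calls_needed(keyspace_size, keys_per_call):
    """Number of calls required to walk the whole keyspace."""
    return max(1, -(-keyspace_size // keys_per_call))  # ceiling division


for size in (100, 1_000_000):
    keys_block = longest_block_ns(size, size)  # KEYS walks everything in one call
    scan_block = longest_block_ns(size, 100)   # SCAN with COUNT 100
    print(size, keys_block, scan_block, calls_needed(size, 100))
# prints:
# 100 4000 4000 1
# 1000000 40000000 4000 10000
```

Under this model the two commands block for about the same time on a 100-key database. On a 1 million key database, KEYS holds the server for roughly 40 ms in one stretch, while each SCAN call holds it for a few microseconds, at the price of many more calls.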
The answer is in the SCAN documentation:
Since these commands allow for incremental iteration, returning only a small number of elements per call, they can be used in production without the downside of commands like KEYS or SMEMBERS that may block the server for a long time (even several seconds) when called against big collections of keys or elements.
So ask for small chunks of data rather than fetching all of it at once.
Also, as Matías Fidemraizer pointed out, Redis is single-threaded and KEYS is a blocking call, so any incoming requests are blocked until execution of KEYS is done.
Whether your data is small or not, it never hurts to apply best practices.