So I'm on a quest to build a Lua script that uses SCAN to find keys matching a pattern and deletes them atomically. I first prepared the following script:
local keys = {};
local done = false;
local cursor = "0"
repeat
    local result = redis.call("SCAN", cursor, "match", ARGV[1], "count", ARGV[2])
    cursor = result[1];
    keys = result[2];
    for i, key in ipairs(keys) do
        redis.call("DEL", key);
    end
    if cursor == "0" then
        done = true;
    end
until done
return true;
This returns the following error: "Err: @user_script: 9: Write commands not allowed after non deterministic commands". So I thought about it a bit and came up with the following script:
local all_keys = {};
local keys = {};
local done = false;
local cursor = "0"
repeat
    local result = redis.call("SCAN", cursor, "match", ARGV[1], "count", ARGV[2])
    cursor = result[1];
    keys = result[2];
    for i, key in ipairs(keys) do
        all_keys[#all_keys+1] = key;
    end
    if cursor == "0" then
        done = true;
    end
until done
for i, key in ipairs(all_keys) do
    redis.call("DEL", key);
end
return true;
This still returns the same error (@user_script: 17: Write commands not allowed after non deterministic commands). This has me stumped. Is there any way to circumvent this issue?
The script was run using phpredis with the following:
$args_arr = [
    0 => 'test*', // pattern
    1 => 100,     // count for SCAN
];
var_dump($redis->eval($script, $args_arr, 0));
UPDATE: the below applies to Redis versions up to 3.2. From that version, effect-based replication lifts the ban on non-determinism so all bets are off (or rather, on).
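For Redis 3.2 and later, a minimal sketch of what the update implies: calling redis.replicate_commands() at the top of the script switches it to effect-based replication, after which DEL is allowed to follow SCAN. The example wraps the question's first script in Python; the function name delete_by_pattern_atomic is hypothetical, and `client` is assumed to be a redis-py style client exposing eval(script, numkeys, *keys_and_args). Newer Redis versions are understood to replicate script effects by default, so check your version's documentation before relying on the call.

```python
# Lua source: the first script from the question plus one extra first line.
# redis.replicate_commands() (Redis 3.2+) asks Redis to replicate the
# script's effects (the DEL calls) instead of the script text itself.
DELETE_BY_PATTERN_LUA = """
redis.replicate_commands()
local cursor = "0"
local deleted = 0
repeat
  local result = redis.call("SCAN", cursor, "MATCH", ARGV[1], "COUNT", ARGV[2])
  cursor = result[1]
  for _, key in ipairs(result[2]) do
    deleted = deleted + redis.call("DEL", key)
  end
until cursor == "0"
return deleted
"""


def delete_by_pattern_atomic(client, pattern, count=100):
    """Run the script with no KEYS and two ARGV values (pattern, count).

    Assumption: `client` behaves like a redis-py client. The whole scan
    still blocks the server while it runs, so large keyspaces hurt.
    """
    return client.eval(DELETE_BY_PATTERN_LUA, 0, pattern, count)
```

With redis-py this would be called as delete_by_pattern_atomic(redis.Redis(), "test*", 100), which returns the number of keys deleted.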
You can't (and shouldn't) mix the SCAN family of commands with any write command in a script because the former's reply depends on the internal Redis data structures that, in turn, are unique to the server process. Put differently, two Redis processes (e.g. master and slave) are not guaranteed to return the same replies (so in the Redis replication context [which isn't operation- but statement-based] that would break it).
Redis tries to protect itself against such cases by blocking any write command (such as DEL) if it is executed after a random command (e.g. SCAN but also TIME, SRANDMEMBER and similar). I'm sure there are ways to get around that, but would you want to do that? Remember, you'll be going into unknown territory where the system's behavior is not defined.
Instead, accept the fact that you shouldn't mix random reads and writes and try to think of a different approach for solving your problem, namely deleting a bunch of keys according to a pattern in an atomic way.
First ask yourself if you can relax any of the requirements. Does it have to be atomic? Atomicity means that Redis will be blocked for the duration of the deletion (regardless of the final implementation) and that the length of the operation depends on the size of the job (i.e. the number of keys that are deleted and their contents [deleting a large set is more expensive than deleting a short string, for example]).
If atomicity isn't a must, periodically/lazily SCAN and delete in small batches. If it is a must, understand that you're basically trying to emulate the evil KEYS command :) But you can do better if you have prior knowledge of the pattern.
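When atomicity can be dropped, the batched approach could look roughly like the sketch below, driven from the client rather than from Lua. It assumes a redis-py style client whose scan(cursor=, match=, count=) returns a (cursor, keys) pair and whose delete(*names) returns the number of keys removed; delete_in_batches is a hypothetical helper name.

```python
def delete_in_batches(client, pattern, batch_size=100):
    """Delete keys matching `pattern` one SCAN page at a time.

    Not atomic: other clients run between pages, so keys created while
    the loop is running may or may not be picked up.
    """
    cursor = 0
    deleted = 0
    while True:
        # COUNT is only a hint, so a page may be empty or larger than asked.
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=batch_size)
        if keys:
            deleted += client.delete(*keys)
        # A returned cursor of 0 means the iteration is complete.
        if int(cursor) == 0:
            break
    return deleted
```

Each round trip is short, so the server is never blocked for long; the price is that the deletion as a whole is not a single atomic step.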
Assuming the pattern is known during runtime of your application, you can collect the relevant keys (e.g. in a Set) and then use that collection to actualize the delete in an atomic and replication-safe manner that's more efficient compared to going over the entire keyspace.
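One way to picture the Set-based approach is sketched below. It assumes a redis-py style client (pipeline(transaction=True) and eval); the helper names set_tracked and delete_tracked and the idea of one Set per pattern are illustrative choices, not something prescribed above. The Lua part reads the Set with SMEMBERS, which is not a random command, so the DEL calls that follow are permitted.

```python
# Lua source: delete every key named in the Set at KEYS[1], then the Set itself.
DELETE_TRACKED_LUA = """
local names = redis.call("SMEMBERS", KEYS[1])
for _, name in ipairs(names) do
  redis.call("DEL", name)
end
redis.call("DEL", KEYS[1])
return #names
"""


def set_tracked(client, index_key, key, value):
    """Write `key` and record its name in the Set `index_key` in one MULTI/EXEC."""
    pipe = client.pipeline(transaction=True)
    pipe.set(key, value)
    pipe.sadd(index_key, key)
    pipe.execute()


def delete_tracked(client, index_key):
    """Atomically delete all tracked keys; returns how many names were tracked."""
    return client.eval(DELETE_TRACKED_LUA, 1, index_key)
```

Note that the script touches keys that are not declared in KEYS, which is a problem for Redis Cluster; the sketch assumes a single, non-clustered instance.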
However, the most "difficult" problem is if you need to run ad-hoc pattern matching while ensuring atomicity. If so, the problem boils down to obtaining a filtered-by-pattern snapshot of the keyspace immediately followed by a succession of deletes (re-emphasizing: while the database is blocked). In that case you can very well use KEYS within your Lua script and hope for the best... (but knowing full well that you may resort to SHUTDOWN NOSAVE quite quickly :P).
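For completeness, the KEYS variant might be sketched as follows. Unlike SCAN, KEYS is not treated as a random command by the script engine, so the DEL calls go through; the cost is a full keyspace walk while the server is blocked. The wrapper name delete_with_keys is hypothetical and `client` is again assumed to be redis-py style.

```python
# Lua source: snapshot the matching names with KEYS, then delete each one.
# Everything runs inside one script, so no other client can interleave.
DELETE_WITH_KEYS_LUA = """
local names = redis.call("KEYS", ARGV[1])
for _, name in ipairs(names) do
  redis.call("DEL", name)
end
return #names
"""


def delete_with_keys(client, pattern):
    """Atomically delete every key matching `pattern`; returns the match count.

    Cost grows with the total number of keys in the database, not just
    with the number of matches.
    """
    return client.eval(DELETE_WITH_KEYS_LUA, 0, pattern)
```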
The Last Optimization is to index the keyspace itself. Both SCAN and KEYS are basically full table scans, so what if we were to index that table? Imagine keeping an index on keys' names that can be queried during a transaction - you can probably use a Sorted Set and lexicographical ranges (HT @TwBert) to do away with most of the pattern matching needs. But at a significant cost... not only will you be doing double bookkeeping (storing each key's name costs RAM and CPU), you'd be forced to add complexity to your application. Why added complexity? Because to implement such an index you'd have to maintain it yourself in the application layer (and possibly in all your other Lua scripts), carefully wrapping each write operation to Redis in a transaction that also updates the index.
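A rough sketch of such an index, limited to prefix patterns like test*: every key name is added to a Sorted Set with the same score, so ZRANGEBYLEX can return the names in a lexicographical range. The helper names and the index key are made up for illustration, and the client is assumed to expose redis-py's zadd(name, mapping) and zrangebylex(name, min, max). Wrapping the index update and the real write in one transaction, as described above, is left out to keep the sketch short.

```python
def index_name(client, index_key, name):
    """Record a key name in the index; equal scores make the ordering lexical."""
    client.zadd(index_key, {name: 0})


def names_with_prefix(client, index_key, prefix):
    """Return indexed names starting with `prefix`, using a lex range.

    "[" marks an inclusive bound. Appending the byte 0xFF to the prefix
    gives an upper bound above every UTF-8 name with that prefix, since
    0xFF never occurs in valid UTF-8.
    """
    low = b"[" + prefix.encode("utf-8")
    high = low + b"\xff"
    return client.zrangebylex(index_key, low, high)
```

The names returned this way could then be passed to DEL inside the same script or transaction, without touching the rest of the keyspace.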
Assuming you did all that (and taking into account the obvious pitfalls like the added complexity's potential for bugs, at least doubled write load on Redis, RAM & CPU, restrictions on scaling and so forth...) you can pat yourself on the back and congratulate yourself for using Redis in a way that it wasn't designed for. While upcoming versions of Redis may (or may not) include better solutions for this challenge (@TwBert - want to do a joint RCP/contrib and again hack Redis a little?), before trying this I really urge you to rethink the original requirements and verify that you're using Redis correctly (i.e. designing your "schema" according to your data access needs).