Partitioned key space for StackExchange Redis

When developing a component that uses Redis, I've found it a good pattern to prefix all keys used by that component so that it does not interfere with other components.

Examples:

  • A component managing users might use keys prefixed by user: and a component managing a log might use keys prefixed by log:.

  • In a multi-tenancy system I want each customer to use a separate key space in Redis to ensure that their data do not interfere. The prefix would then be something like customer:<id>: for all keys related to a specific customer.
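
As a minimal sketch of the two prefixing examples above: the helper names `CustomerPrefix` and `ToPrefixedKey` are hypothetical, and plain strings stand in for `RedisKey` so the snippet runs without a Redis server.

```csharp
using System;

// Hypothetical helper: builds the key prefix for one tenant, e.g. "customer:42:".
string CustomerPrefix(int customerId)
{
    return "customer:" + customerId + ":";
}

// Hypothetical helper: maps a component-local key to the key actually stored in Redis.
string ToPrefixedKey(string prefix, string key)
{
    return prefix + key;
}

Console.WriteLine(ToPrefixedKey("user:", "1001"));                 // user:1001
Console.WriteLine(ToPrefixedKey(CustomerPrefix(42), "settings")); // customer:42:settings
```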

Using Redis is still new stuff for me. My first idea for this partitioning pattern was to use separate database identifiers for each partition. However, that seems to be a bad idea because the number of databases is limited and it seems to be a feature that is about to be deprecated.

An alternative to this would be to let each component get an IDatabase instance and a RedisKey that it shall use to prefix all keys. (I'm using StackExchange.Redis)

I've been looking for an IDatabase wrapper that automatically prefixes all keys so that components can use the IDatabase interface as-is without having to worry about its keyspace. I didn't find anything though.

So my question is: What is a recommended way to work with partitioned key spaces on top of StackExchange Redis?

I'm now thinking about implementing my own IDatabase wrapper that would prefix all keys. I think most methods would just forward their calls to the inner IDatabase instance. However, some methods would require a bit more work: For example SORT and RANDOMKEY.

Mårten Wikström asked Sep 25 '14 06:09


2 Answers

I've created an IDatabase wrapper now that provides a key space partitioning.

The wrapper is created by calling an extension method on IDatabase:

    ConnectionMultiplexer multiplexer = ConnectionMultiplexer.Connect("localhost");
    IDatabase fullDatabase = multiplexer.GetDatabase();
    IDatabase partitioned = fullDatabase.GetKeyspacePartition("my-partition");

Almost all of the methods in the partitioned wrapper have the same structure:

    public bool SetAdd(RedisKey key, RedisValue value, CommandFlags flags = CommandFlags.None)
    {
        return this.Inner.SetAdd(this.ToInner(key), value, flags);
    }

They simply forward the invocation to the inner database and prepend the key space prefix to any RedisKey arguments before passing them on.

The CreateBatch and CreateTransaction methods simply create wrappers for those interfaces, but with the same base wrapper class (as most of the methods to wrap are defined by IDatabaseAsync).

The KeyRandomAsync and KeyRandom methods are not supported. Invocations will throw a NotSupportedException. This is not a concern for me, and to quote @Marc Gravell:

I can't think of any sane way of achieving that, but I suspect NotSupportedException("RANDOMKEY is not supported when a key-prefix is specified") is entirely reasonable (this isn't a commonly used command anyway)

I have not yet implemented ScriptEvaluate and ScriptEvaluateAsync because it is unclear to me how I should handle the RedisResult return value. The input parameters to these methods accept RedisKey which should be prefixed, but the script itself could return keys and in that case I think it would make (most) sense to unprefix those keys. For the time being, those methods will throw a NotImplementedException...

The sort methods (Sort, SortAsync, SortAndStore and SortAndStoreAsync) have special handling for the by and get parameters. These are prefixed as normal unless they have one of the special values: nosort for by and # for get.
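
The special-casing can be sketched roughly as follows. This is an illustration only, not the wrapper's actual code: the helper names are hypothetical and plain strings stand in for `RedisValue`.

```csharp
using System;

// Hypothetical helper: prefixes a BY pattern unless it is the special value "nosort".
string SortByToInner(string prefix, string by)
{
    if (by == null || by == "nosort") return by;
    return prefix + by;
}

// Hypothetical helper: prefixes a GET pattern unless it is "#" (the element itself).
string SortGetToInner(string prefix, string get)
{
    if (get == null || get == "#") return get;
    return prefix + get;
}

Console.WriteLine(SortByToInner("my-partition:", "weight_*"));  // my-partition:weight_*
Console.WriteLine(SortByToInner("my-partition:", "nosort"));    // nosort
Console.WriteLine(SortGetToInner("my-partition:", "#"));        // #
Console.WriteLine(SortGetToInner("my-partition:", "object_*")); // my-partition:object_*
```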

Finally, to allow prefixing ITransaction.AddCondition I had to use a bit of reflection:

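    using System;
    using System.Reflection;
    using System.Runtime.Serialization;
    using StackExchange.Redis;

    internal static class ConditionHelper
    {
        // Clones a Condition, passing every RedisKey field through rewriteFunc.
        public static Condition Rewrite(this Condition outer, Func<RedisKey, RedisKey> rewriteFunc)
        {
            // ThrowIf is an argument-checking helper that is not shown here.
            ThrowIf.ArgNull(outer, "outer");
            ThrowIf.ArgNull(rewriteFunc, "rewriteFunc");

            // Create an instance of the same concrete type without running a constructor.
            Type conditionType = outer.GetType();
            object inner = FormatterServices.GetUninitializedObject(conditionType);

            // Copy all non-public instance fields, rewriting only the keys.
            foreach (FieldInfo field in conditionType.GetFields(BindingFlags.NonPublic | BindingFlags.Instance))
            {
                if (field.FieldType == typeof(RedisKey))
                {
                    field.SetValue(inner, rewriteFunc((RedisKey)field.GetValue(outer)));
                }
                else
                {
                    field.SetValue(inner, field.GetValue(outer));
                }
            }

            return (Condition)inner;
        }
    }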

This helper is used by the wrapper like this:

    internal Condition ToInner(Condition outer)
    {
        if (outer == null)
        {
            return outer;
        }
        else
        {
            return outer.Rewrite(this.ToInner);
        }
    }

There are several other ToInner methods for the different kinds of parameters that contain a RedisKey, but they all more or less end up calling:

    internal RedisKey ToInner(RedisKey outer)
    {
        return this.Prefix + outer;
    }

I have now created a pull request for this:

https://github.com/StackExchange/StackExchange.Redis/pull/92

The extension method is now called WithKeyPrefix, and the reflection hack for rewriting conditions is no longer needed because the new code has access to the internals of the Condition classes.
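
A minimal usage sketch, assuming the extension method as it later shipped in the StackExchange.Redis.KeyspaceIsolation namespace and a Redis server reachable on localhost (the key and prefix values are only examples):

```csharp
using System;
using StackExchange.Redis;
using StackExchange.Redis.KeyspaceIsolation; // namespace that hosts WithKeyPrefix

// Assumption: a Redis server is listening on localhost:6379.
ConnectionMultiplexer multiplexer = ConnectionMultiplexer.Connect("localhost");
IDatabase fullDatabase = multiplexer.GetDatabase();

// The prefix is prepended verbatim, so include the separator yourself.
IDatabase partitioned = fullDatabase.WithKeyPrefix("my-partition:");

// Written through the wrapper, so the stored key is "my-partition:greeting".
partitioned.StringSet("greeting", "hello");

// The same value is visible through the unwrapped database under the prefixed key.
Console.WriteLine(fullDatabase.StringGet("my-partition:greeting")); // hello
```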

Mårten Wikström answered Sep 20 '22 00:09


Intriguing suggestion. Note that redis already offers a simple isolation mechanism by way of database numbers, for example:

    // note: default database is 0
    var logdb = muxer.GetDatabase(1);
    var userdb = muxer.GetDatabase(2);

StackExchange.Redis will handle all the work to issue commands to the correct databases - i.e. commands issued via logdb will be issued against database 1.

Advantages:

  • inbuilt
  • works with all clients
  • provides complete keyspace isolation
  • doesn't require additional per-key space for the prefixes
  • works with KEYS, SCAN, FLUSHDB, RANDOMKEY, SORT, etc
  • you get high-level per-db keyspace metrics via INFO

Disadvantages:

  • not supported on redis-cluster
  • not supported via intermediaries like twemproxy

Note:

  • the number of databases is a configuration option; IIRC it defaults to 16 (numbers 0-15), but can be tweaked in your configuration file via:

    # moar databases!!!
    databases 400
    

This is actually how we (Stack Overflow) use redis with multi-tenancy; database 0 is "global", 1 is "stackoverflow", etc. It should also be clear that if required, it is then a fairly simple thing to migrate an entire database to a different node using SCAN and MIGRATE (or more likely: SCAN, DUMP, PTTL and RESTORE - to avoid blocking).
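
A rough sketch of the SCAN, DUMP, PTTL and RESTORE loop using StackExchange.Redis. The two endpoints and the database number are placeholders, and this is not a production migration tool: it is not atomic, and RESTORE fails if a key already exists on the target.

```csharp
using System;
using StackExchange.Redis;

// Assumptions: source and target endpoints and database number 1 are placeholders.
ConnectionMultiplexer source = ConnectionMultiplexer.Connect("localhost:6379");
ConnectionMultiplexer target = ConnectionMultiplexer.Connect("localhost:6380");

IServer sourceServer = source.GetServer("localhost:6379");
IDatabase sourceDb = source.GetDatabase(1);
IDatabase targetDb = target.GetDatabase(1);

// IServer.Keys iterates the keyspace with SCAN on servers that support it.
foreach (RedisKey key in sourceServer.Keys(database: 1))
{
    byte[] payload = sourceDb.KeyDump(key);      // DUMP
    if (payload == null) continue;               // key disappeared since it was scanned
    TimeSpan? ttl = sourceDb.KeyTimeToLive(key); // PTTL where available; null means no expiry
    targetDb.KeyRestore(key, payload, ttl);      // RESTORE
}
```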

Since database partitioning is not supported in redis-cluster, there may be a valid scenario here, but it should also be noted that redis nodes are easy to spin up, so another valid option is simply: use different redis groups for each (different port numbers, etc) - which would also have the advantage of allowing genuine concurrency between nodes (CPU isolation).


However, what you propose is not unreasonable; there is actually "prior" here... again, largely linked to how we (Stack Overflow) use redis: while databases work fine for isolating keys, no isolation is currently provided by redis for channels (pub/sub). Because of this, StackExchange.Redis actually includes a ChannelPrefix option on ConfigurationOptions that allows you to specify a prefix that is automatically added during PUBLISH and removed when receiving notifications. So if your ChannelPrefix is foo:, and you publish an event bar, the actual event is published to the channel foo:bar; likewise, any callback you have only sees bar. It could be that this is something that is viable for databases too, but to emphasize: at the moment this configuration option is at the multiplexer level, not the individual ISubscriber. To be comparable to the scenario you present, this would need to be at the IDatabase level.

Possible, but a decent amount of work. If possible, I would recommend investigating the option of simply using database numbers...

Marc Gravell answered Sep 20 '22 00:09