Suppose your web application needs to make a number of Redis calls to render a page, such as fetching a bunch of user hashes. To speed this up, you could wrap your Redis commands in a MULTI/EXEC section, thus using pipelining, so that you avoid many round-trips. But you also want to shard your data, because you have lots of it and/or you want to distribute writes. Then pipelining wouldn't work, because different keys could live on different nodes, unless you have a clear idea of your application's data layout and shard based on roles rather than with a hash function. So, what are the best practices for sharding data across different servers without compromising performance too much, given that many servers may have to be contacted to complete a "conceptually unique" job? I believe the answer depends on the web application being developed, and I'll eventually run some tests, but it would be helpful to know how others have coped with the trade-offs I mentioned.
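MULTI/EXEC and pipelining are two different things. MULTI/EXEC is a transaction: the server queues the commands and runs them as one atomic unit. Pipelining is a client-side technique: the client sends several commands without waiting for each reply, which saves round-trips. You can do MULTI/EXEC without any pipelining and vice versa.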
If you want to shard and pipeline at the same time, you need to group the operations by the Redis instance that owns their keys, and then send one pipeline per instance.
Here is a simple example using Ruby: https://gist.github.com/2587593
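The grouping step described above can also be sketched in Python. This is only an illustration under stated assumptions: `shard_for` uses CRC32 modulo the number of shards as the hash function (a real deployment may shard differently), and `FakeClient`/`FakePipeline` are hypothetical in-memory stand-ins that mimic the `pipeline(transaction=False)`, `hgetall` and `execute` calls of a redis-py client so that the sketch runs without a server.

```python
import zlib
from collections import defaultdict


def shard_for(key, num_shards):
    """Map a key to a shard index. CRC32 modulo N is an assumption,
    not necessarily the hash your client library uses."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def group_by_shard(keys, num_shards):
    """Return {shard index: [keys owned by that shard]}."""
    groups = defaultdict(list)
    for key in keys:
        groups[shard_for(key, num_shards)].append(key)
    return dict(groups)


class FakePipeline:
    """Hypothetical stand-in for a client pipeline: queues commands,
    then answers them all in a single simulated round-trip."""

    def __init__(self, client):
        self.client = client
        self.queued = []

    def hgetall(self, key):
        self.queued.append(key)
        return self

    def execute(self):
        self.client.round_trips += 1
        replies = [dict(self.client.data.get(k, {})) for k in self.queued]
        self.queued = []
        return replies


class FakeClient:
    """Hypothetical stand-in for a connection to one Redis instance."""

    def __init__(self, data):
        self.data = data
        self.round_trips = 0

    def pipeline(self, transaction=False):
        return FakePipeline(self)


def fetch_hashes(clients, keys):
    """Fetch many hashes using one pipeline per shard."""
    results = {}
    for shard, shard_keys in group_by_shard(keys, len(clients)).items():
        # transaction=False: plain pipelining, no MULTI/EXEC wrapper.
        pipe = clients[shard].pipeline(transaction=False)
        for key in shard_keys:
            pipe.hgetall(key)
        # Replies come back in the order the commands were queued.
        for key, value in zip(shard_keys, pipe.execute()):
            results[key] = value
    return results
```

With this layout, the number of round-trips is bounded by the number of shards rather than by the number of keys. The shards are still contacted one after another here.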
One way to further improve performance is to parallelize the traffic on the Redis instances once the operations have been grouped (i.e. you group the operations, you send them to all instances in parallel, you wait for the answers from all instances).
This is a bit more complex, because an asynchronous, non-blocking client is required. For maximum performance, C/C++ should be used on the client side. This can be easily implemented by using hiredis plus the event loop of your choice.
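The "group, send to all instances in parallel, wait for all answers" pattern can be sketched with Python's `asyncio`, which is used here for brevity rather than the hiredis-based C/C++ approach suggested above. Everything below is an assumption made for illustration: `FakeAsyncShard` is a hypothetical in-memory stand-in whose `hgetall_many` method simulates one pipelined round-trip, and `shard_for` again uses CRC32 modulo the shard count.

```python
import asyncio
import zlib
from collections import defaultdict


def shard_for(key, num_shards):
    """Map a key to a shard index (CRC32 modulo N is an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


class FakeAsyncShard:
    """Hypothetical stand-in for one Redis instance behind an async client.
    The sleep simulates network latency for one pipelined batch."""

    def __init__(self, data, latency=0.01):
        self.data = data
        self.latency = latency

    async def hgetall_many(self, keys):
        await asyncio.sleep(self.latency)
        return [dict(self.data.get(k, {})) for k in keys]


async def fetch_hashes_parallel(shards, keys):
    """Group keys per shard, query all shards concurrently, merge replies."""
    groups = defaultdict(list)
    for key in keys:
        groups[shard_for(key, len(shards))].append(key)

    order = list(groups.items())
    # gather() starts every batch before waiting, so the total wait is
    # roughly the slowest shard rather than the sum over all shards.
    batches = await asyncio.gather(
        *(shards[idx].hgetall_many(ks) for idx, ks in order)
    )

    results = {}
    for (_, ks), values in zip(order, batches):
        results.update(zip(ks, values))
    return results
```

A caller would run it with `asyncio.run(fetch_hashes_parallel(shards, keys))`. With a real client, each `hgetall_many` call would be replaced by a non-transactional pipeline executed against that shard.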