I am trying to insert a large(-ish) number of elements in the shortest time possible and I tried these two alternatives:
1) Pipelining:

    List<Task> addTasks = new List<Task>();
    for (int i = 0; i < table.Rows.Count; i++)
    {
        DataRow row = table.Rows[i];
        Task<bool> addAsync = redisDB.SetAddAsync(string.Format(keyFormat, row.Field<int>("Id")), row.Field<int>("Value"));
        addTasks.Add(addAsync);
    }
    Task[] tasks = addTasks.ToArray();
    Task.WaitAll(tasks);

2) Batching:

    List<Task> addTasks = new List<Task>();
    IBatch batch = redisDB.CreateBatch();
    for (int i = 0; i < table.Rows.Count; i++)
    {
        DataRow row = table.Rows[i];
        Task<bool> addAsync = batch.SetAddAsync(string.Format(keyFormat, row.Field<int>("Id")), row.Field<int>("Value"));
        addTasks.Add(addAsync);
    }
    batch.Execute();
    Task[] tasks = addTasks.ToArray();
    Task.WaitAll(tasks);

I am not noticing any significant time difference (actually I expected the batch method to be faster): for approx 250K inserts I get approx 7 sec for pipelining vs approx 8 sec for batching.
Reading from the documentation on pipelining,
"Using pipelining allows us to get both requests onto the network immediately, eliminating most of the latency. Additionally, it also helps reduce packet fragmentation: 20 requests sent individually (waiting for each response) will require at least 20 packets, but 20 requests sent in a pipeline could fit into much fewer packets (perhaps even just one)."
To me, this sounds a lot like batching behaviour. I wonder whether there is any big difference between the two behind the scenes, because a simple check with procmon shows almost the same number of TCP Sends in both versions.
Redis pipelining is a technique for improving performance by issuing multiple commands at once without waiting for the response to each individual command. Pipelining is supported by most Redis clients. This document describes the problem that pipelining is designed to solve and how pipelining works in Redis.
StackExchange.Redis is fully thread safe; the expected usage is that a single multiplexer is reused between concurrent requests, which makes it very parallel. Two concurrent callers do not block each other: the two requests are pipelined and the results are made available to each caller when they come back.
Multiplexing allows for a form of implicit pipelining. Pipelining, in the Redis sense, means sending commands to the server without waiting for the responses to earlier commands. If you've ever been through a drive-through window and rattled off your entire order into the speaker, this is like pipelining.
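As a rough illustration of that implicit pipelining, here is a minimal sketch in which two operations are issued on one shared multiplexer before either reply is awaited. The connection string `localhost` and the key `demo:set` are assumptions made for the example, not values from the question.

```csharp
using System;
using System.Threading.Tasks;
using StackExchange.Redis;

class SharedMultiplexerDemo
{
    static async Task Main()
    {
        // Assumption: a Redis server is reachable on localhost (default port).
        using ConnectionMultiplexer muxer = await ConnectionMultiplexer.ConnectAsync("localhost");
        IDatabase db = muxer.GetDatabase();

        // Both commands are handed to the multiplexer before either reply is
        // awaited, so neither waits for the other's round trip.
        Task<bool> first = db.SetAddAsync("demo:set", 1);
        Task<bool> second = db.SetAddAsync("demo:set", 2);

        // Each task completes when its own reply comes back.
        bool[] added = await Task.WhenAll(first, second);
        Console.WriteLine($"added: {added[0]}, {added[1]}");
    }
}
```

The same `muxer` instance would normally be created once and shared by all callers in the application rather than opened per request.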
StackExchange.Redis is a high performance general purpose redis client for .NET languages (C#, etc.). It is the logical successor to BookSleeve, and is the client developed by (and used by) Stack Exchange for busy sites like Stack Overflow.
Behind the scenes, SE.Redis does quite a bit of work to try to avoid packet fragmentation, so it isn't surprising that the two approaches perform quite similarly in your case. The main differences between batching and flat pipelining are:
… multi/exec transaction or a Lua script)

In most cases, you will do better by avoiding batching, since SE.Redis achieves most of what batching does automatically when simply adding work.
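For reference, a multi/exec transaction in StackExchange.Redis might be sketched as follows. This is an illustration only: the helper name `AddAtomically` and its parameters are hypothetical, and it assumes an already-connected `IDatabase`.

```csharp
using System.Threading.Tasks;
using StackExchange.Redis;

static class TransactionSketch
{
    // Hypothetical helper, not part of the question's code.
    public static bool AddAtomically(IDatabase redisDB, RedisKey key, int[] values)
    {
        ITransaction tran = redisDB.CreateTransaction();
        var pending = new Task<bool>[values.Length];

        for (int i = 0; i < values.Length; i++)
        {
            // Commands are only queued locally at this point. Do not wait on
            // these tasks before Execute(): nothing has been sent yet.
            pending[i] = tran.SetAddAsync(key, values[i]);
        }

        // Sends the queued commands wrapped in MULTI ... EXEC.
        // Returns false if the transaction was not committed.
        bool committed = tran.Execute();
        if (committed)
        {
            Task.WaitAll(pending);
        }
        return committed;
    }
}
```

Unlike a plain batch, the commands inside the transaction are executed by the server as one unit, at the cost of the extra MULTI and EXEC commands.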
As a final note; if you want to avoid local overhead, one final approach might be:

    redisDB.SetAdd(
        string.Format(keyFormat, row.Field<int>("Id")),
        row.Field<int>("Value"),
        flags: CommandFlags.FireAndForget);

This sends everything down the wire, neither waiting for responses nor allocating incomplete Tasks to represent future values. You might want to do something like a Ping at the end without fire-and-forget, to check the server is still talking to you. Note that using fire-and-forget does mean that you won't notice any server errors that get reported.
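Put together, the fire-and-forget loop with a closing Ping could look roughly like the sketch below. The helper name `Load` is hypothetical; `table`, `keyFormat`, and the "Id" and "Value" columns are taken from the question's code, and an already-connected `IDatabase` is assumed.

```csharp
using System;
using System.Data;
using StackExchange.Redis;

static class FireAndForgetLoader
{
    // Hypothetical helper wrapping the loop from the question.
    public static void Load(IDatabase redisDB, DataTable table, string keyFormat)
    {
        foreach (DataRow row in table.Rows)
        {
            // Queue the set-add without waiting for, or tracking, its reply.
            redisDB.SetAdd(
                string.Format(keyFormat, row.Field<int>("Id")),
                row.Field<int>("Value"),
                flags: CommandFlags.FireAndForget);
        }

        // A normal round trip (not fire-and-forget). It throws if the server
        // does not answer, which is the "still talking to you" check.
        TimeSpan latency = redisDB.Ping();
        Console.WriteLine($"Server answered PING in {latency.TotalMilliseconds} ms");
    }
}
```

The Ping only confirms that the connection is alive; any errors the server returned for individual fire-and-forget commands are still discarded.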