Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Redis Cluster vs ZeroMQ in Pub/Sub, for horizontally scaled distributed systems

If I were to design a huge distributed system whose throughput should scale linearly with the number of subscribers and number of channels in the system, which would be better ?

1) Redis Cluster (only for Redis 3.0 alpha, if its in cluster mode, you can publish in one node and subscribe in another completely different node, and the messages will propagate and reach you). The complexity of Publish is O(N+M), where N is the number of subscribed clients and M is the number of subscribed patterns in the system, but how does it scale when in a Redis Cluster ? I accept educated guesses on this.

2) ZeroMQ since 3.x, it does server-side filtering, so it also has some time complexity there, but I have not seen anything about it in the documentation. If I wanted to scale it, I could just have lots of servers publishing to whatever channels, and each subscriber would connect to all the servers, and subscribe for the desired channel. That seems nice.

So which of those is better for horizontal scaling of a huge publisher system ? What are other solutions I should look into ? Remember, I want to minimize latency and throughput, but being able to scale horizontally.

like image 202
João Pinto Jerónimo Avatar asked Dec 07 '12 15:12

João Pinto Jerónimo


2 Answers

You want to minimize latency, I guess. The number of channels is irrelevant. The key factors are the number of publishers and number of subscribers, message size, number of messages per second per publisher, number of messages received by each subscriber, roughly. ZeroMQ can do several million small messages per second from one node to another; your bottleneck will be the network long before it's the software. Most high-volume pubsub architectures therefore use something like PGM multicast, which ZeroMQ supports.

like image 73
Pieter Hintjens Avatar answered Oct 23 '22 22:10

Pieter Hintjens


In Redis, like in ZeroMQ, the bottleneck will be the network. Redis can reach millions of messages per second, at least as much if not more than ZeroMQ.

You should be aware that the current implementation of Redis Cluster distributes PUBLISH messages across all cluster nodes using the inter-node bus. This approach assumes that PUBLISH is extremely cheap on Redis (as explained in this issue on Github).

However, there is a small overhead involved which is inter-node communication. As you scale up this overhead will be more significant. There is another Redis Cluster implementation I'm aware of - please note it's a commercial one - in which channels or patterns are distributed across cluster nodes in a similar fashion to the way Redis keys are distributed. At least according to the vendor, this should save the overhead of inter-node communication and increase performance, but I have not benchmarked it myself.

like image 37
Lea Krause Avatar answered Oct 23 '22 22:10

Lea Krause