
How can I scale Kafka consumers?


I'm reading the Kafka documentation and noticed the following line:

Note however that there cannot be more consumer instances in a consumer group than partitions.

Hmm. How can I auto-scale this?

For example let's say I have a messaging system with hi/lo priorities, so I create a topic for messages and partitions for hi and lo priority messages.

If this was RabbitMQ, I'd have an auto-scalable group of consumers assigned to each partition, like this:

enter image description here

If I understand the Kafka model I can't have >1 consumer per partition in a consumer group, so that picture doesn't work for Kafka, right?

Ok, so what about >1 consumer groups like this:

enter image description here

That gets around Kafka's limitation, but if I understand how this works, both consumer groups would be pulling from the same partition (for example msg.hi), each with its own offsets, so neither would know about the other. That means messages would likely be delivered twice!

How can I achieve the capability I had in the Rabbit design w/Kafka and still maintain the "queue-ness" of the behavior (I don't want to send a message twice)? What am I missing?

Greg asked Mar 24 '16 15:03



1 Answer

A topic is made up of partitions. The number of partitions decides the maximum number of consumers that can actively read in a group.

enter image description here

In the above picture, we have only one consumer. It reads all the messages from all the partitions. When you increase the number of consumers in the group, the partitions are reassigned: instead of consumer 1 reading everything, consumer 2 takes over some of the partitions and shares the load, as shown below.

enter image description here

What happens if I have more consumers than partitions? Each partition is assigned to exactly one consumer, so each consumer ends up with one partition. Any additional consumers in the group sit idle unless you increase the number of partitions for the topic.

enter image description here
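The assignment behaviour described above can be illustrated with a small simulation. This is only a sketch: `assign_partitions` is a hypothetical helper that hands out partitions round-robin, not Kafka's actual assignor, and the consumer names `c1` to `c4` are made up. It assumes a topic with 3 partitions and shows that a fourth consumer in the group gets nothing to read.

```python
def assign_partitions(partitions, consumers):
    """Hand out partitions round-robin so each partition has exactly one owner.

    Simplified stand-in for a consumer-group assignor (hypothetical helper).
    """
    assignment = {consumer: [] for consumer in consumers}
    if not consumers:
        return assignment
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2]  # assumed: a topic with 3 partitions

one = assign_partitions(partitions, ["c1"])
two = assign_partitions(partitions, ["c1", "c2"])
four = assign_partitions(partitions, ["c1", "c2", "c3", "c4"])

# Consumers that received no partition have nothing to do.
idle = [consumer for consumer, owned in four.items() if not owned]

print("1 consumer: ", one)
print("2 consumers:", two)
print("4 consumers:", four)
print("idle:", idle)
```

With one consumer, `c1` owns all three partitions; with two, the load is split; with four, `c4` owns no partition and stays idle until the topic gets more partitions.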

So we need to choose the number of partitions carefully, because it decides the maximum number of consumers in the group. Changing the partition count of an existing topic is not recommended, as it can cause issues.

For example, assume a producer writes names into a topic with 3 partitions. All names starting with A-I go to partition 1, J-R to partition 2 and S-Z to partition 3. Also assume we have already produced 1 million messages. If you now increase the number of partitions from 3 to 5, the A-Z range is split differently: A-F in partition 1, G-K in partition 2, L-Q in partition 3, R-U in partition 4 and V-Z in partition 5.

The messages already written stay where they are, but new messages for the same name can land in a different partition than the older ones. That affects the ordering we had before, because ordering is only maintained within a partition. You need to be aware of this. If it could be a problem, choose the number of partitions up front.
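The effect of changing the partition count can be sketched for hash-based partitioning, where a keyed message goes to `hash(key) % number_of_partitions`. The example below is an illustration under assumptions: `partition_for` is a hypothetical helper, CRC32 stands in for the hash Kafka's default partitioner really uses (murmur2), and the names are made-up sample keys.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a key to a partition as hash(key) % num_partitions.

    CRC32 is only a stand-in here; Kafka's default partitioner uses murmur2.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


names = ["alice", "bob", "carol", "dave", "erin", "frank", "grace", "heidi"]

before = {name: partition_for(name, 3) for name in names}  # topic with 3 partitions
after = {name: partition_for(name, 5) for name in names}   # same topic grown to 5

# Keys whose new messages land in a different partition than their old ones.
moved = [name for name in names if before[name] != after[name]]

for name in names:
    print(f"{name}: partition {before[name]} -> {after[name]}")
print("keys that changed partition:", moved)
```

Any key listed in `moved` has its older messages in one partition and its newer ones in another, so a consumer can no longer rely on seeing that key's messages in order.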


More info is here - http://www.vinsguru.com/kafka-scaling-consumers-out-for-a-consumer-group/

vins answered Oct 10 '22 12:10