Let's say that I have 10 partitions for a given topic in Kafka. What would my options be to automatically load balance these 10 partitions between consumers?
I have read this post https://stackoverflow.com/a/28580363/317384 but I'm not sure it covers what I'm looking for, or maybe I'm just not getting it.
If I spin up a worker with one consumer for each partition, all work would be consumed by that worker.
But what happens if I spin up another instance of the same worker elsewhere? Will the client libraries/Kafka somehow detect this and re-balance the load between the two workers so that some of the active consumers on worker1 are now idle and the same consumers on worker2 becomes active?
I would like to be able to add and remove workers on demand, and spread the load across those, is that possible?
e.g. from this:
to this:
During a rebalance event, every consumer that's still in communication with the group coordinator must revoke then regain its partitions, for all partitions within its assignment. More partitions to manage means more time to wait as all the consumers within the group take the time to manage those relationships.
Kafka Rebalance happens when a new consumer is either added (joined) into the consumer group or removed (left). It becomes dramatic during application service deployment rollout, as multiple instances restarted at the same time, and rebalance latency significantly increasing.
You can't have multiple consumers that belong to the same group in one thread and you can't have multiple threads safely use the same consumer. One consumer per thread is the rule. To run multiple consumers in the same group in one application, you will need to run each in its own thread.
Kafka consumers are part of consumer groups. A group has one or more consumers in it. Each partition gets assigned to one consumer. And partitions are how Kafka scales out. If you have more consumers than partitions, then some of your consumers will be idle. If you have more partitions than consumers, more than one partition may get assigned to a single consumer.
When a new consumer joins, a rebalance occurs, and the new consumer is assigned some partitions previously assigned to other consumers. In your case, if there were 10 partitions all being consumed by one consumer, and another consumer joins, there'll be a rebalance, and afterwards, there'll be (typically) five partitions per consumer.
It's worth noting that during a rebalance, the consumer group "pauses". A similar thing happens when consumers gracefully leave, or the leader detects that a consumer has left.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With