
Conditions in which Kafka Consumer (Group) triggers a rebalance

I was going through the Consumer Config for Kafka.

  • https://kafka.apache.org/documentation/#newconsumerconfigs

Which parameters will trigger a rebalance? For instance, will the following parameter? Are there any other parameters we need to change, or will the defaults suffice?

connections.max.idle.ms: Close idle connections after the number of milliseconds specified by this config. (Type: long; default: 540000; importance: medium)

Also, we have three different topics.

  1. Is it a bad idea to have the same Consumer Group (same ID) consuming from multiple topics?
  2. Assuming the above scenario is valid (not necessarily best practice): if one of the topics has very light traffic, will it cause the Consumer Group to rebalance?

    A follow-up question: what factors affect rebalancing and its performance?

Durga Deep asked Aug 25 '17 19:08




1 Answer

These conditions will trigger a group rebalance:

  • The number of partitions changes for any of the subscribed topics

  • A subscribed topic is created or deleted

  • An existing member of the consumer group dies

  • A new member is added to an existing consumer group via the join API
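
As a rough illustration, the conditions above can be modeled as a comparison between two snapshots of group state. This is a toy sketch, not how the Kafka group coordinator is implemented; `GroupState` and `needsRebalance` are hypothetical names invented for this example.

```java
import java.util.Map;
import java.util.Set;

class RebalanceTriggerModel {
    /** Hypothetical snapshot: group members plus the partition count of each subscribed topic. */
    static final class GroupState {
        final Set<String> members;
        final Map<String, Integer> partitionsByTopic;

        GroupState(Set<String> members, Map<String, Integer> partitionsByTopic) {
            this.members = members;
            this.partitionsByTopic = partitionsByTopic;
        }
    }

    /** True if membership, the set of subscribed topics, or any partition count differs. */
    static boolean needsRebalance(GroupState before, GroupState after) {
        boolean membershipChanged = !before.members.equals(after.members);
        // Map equality covers a topic appearing or disappearing as well as a partition count change.
        boolean topicsChanged = !before.partitionsByTopic.equals(after.partitionsByTopic);
        return membershipChanged || topicsChanged;
    }

    public static void main(String[] args) {
        GroupState before = new GroupState(Set.of("c1", "c2"), Map.of("orders", 3, "audit", 1));
        GroupState memberDied = new GroupState(Set.of("c1"), Map.of("orders", 3, "audit", 1));
        GroupState morePartitions = new GroupState(Set.of("c1", "c2"), Map.of("orders", 6, "audit", 1));

        System.out.println(needsRebalance(before, before));         // false
        System.out.println(needsRebalance(before, memberDied));     // true
        System.out.println(needsRebalance(before, morePartitions)); // true
    }
}
```

Message volume does not appear in the snapshot at all; in this model, only membership and topic metadata matter.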

Is it a bad idea to have the same Consumer Group (same ID) consuming from multiple topics?

It is at least valid; whether it is good or bad depends on your use case. The official Java client API supports it, as this method definition shows:

public void subscribe(Collection<String> topics,
             ConsumerRebalanceListener listener)

It accepts a collection of topics.

If one of the topics has very light traffic, will it cause the Consumer Group to rebalance?

No, because light traffic is not among the conditions listed above. Looking at it from the topic's side alone, a rebalance happens only when the topic is deleted or its partition count changes.

Update:

Thanks to @Hans Jespersen for the comment about session timeouts and heartbeats.

The following is quoted from the Kafka Consumer javadoc:

After subscribing to a set of topics, the consumer will automatically join the group when poll(long) is invoked. The poll API is designed to ensure consumer liveness. As long as you continue to call poll, the consumer will stay in the group and continue to receive messages from the partitions it was assigned. Underneath the covers, the poll API sends periodic heartbeats to the server; when you stop calling poll (perhaps because an exception was thrown), then no heartbeats will be sent. If a period of the configured session timeout elapses before the server has received a heartbeat, then the consumer will be kicked out of the group and its partitions will be reassigned.

In your question, you ask which parameters will trigger a rebalance.

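In this case, two configs are related to rebalancing: session.timeout.ms and max.poll.records. session.timeout.ms is how long the group coordinator waits for a heartbeat before removing the consumer from the group. max.poll.records caps the number of records a single poll returns, which bounds how much work sits between two polls. (In clients from 0.10.1 onward, heartbeats are sent from a background thread, and max.poll.interval.ms limits the time allowed between polls.)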
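
A minimal sketch of setting these configs with a plain `java.util.Properties` object, which the Java client's `KafkaConsumer` constructor accepts. The broker address, group ID, and numeric values are placeholders chosen for illustration, not recommendations; `ConsumerTuning` and `consumerProps` are hypothetical names.

```java
import java.util.Properties;

class ConsumerTuning {
    /** Builds consumer properties; all values here are illustrative assumptions. */
    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        // How long the coordinator waits for a heartbeat before removing the consumer.
        props.setProperty("session.timeout.ms", "10000");
        // Upper bound on records returned by one poll, which limits work done between polls.
        props.setProperty("max.poll.records", "100");
        // Not named in the answer: on clients 0.10.1 and later, this bounds the time between polls.
        props.setProperty("max.poll.interval.ms", "300000");
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProps("localhost:9092", "example-group");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Lowering max.poll.records is one way to shorten each processing batch; whether that or a longer timeout fits better depends on how long each record takes to process.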

From this we can also learn that it is bad practice to do a lot of work between calls to poll: long-running work may block the heartbeat and cause a session timeout.
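
The session timeout rule from the quoted javadoc can be sketched as a small timing model. This is a simplified illustration under the assumption that the coordinator only tracks the time of the last heartbeat; `SessionTimeoutModel` and `expiryTime` are hypothetical names, not part of the Kafka client.

```java
import java.util.List;
import java.util.OptionalLong;

class SessionTimeoutModel {
    /**
     * Returns the time (ms) at which the session would expire, or empty if the
     * consumer is still considered alive at nowMs. heartbeatTimesMs must be
     * non-empty and sorted ascending; its first entry is the time the consumer joined.
     */
    static OptionalLong expiryTime(List<Long> heartbeatTimesMs, long sessionTimeoutMs, long nowMs) {
        long deadline = heartbeatTimesMs.get(0) + sessionTimeoutMs;
        for (long t : heartbeatTimesMs.subList(1, heartbeatTimesMs.size())) {
            if (t > deadline) {
                // The gap was too long: the session expired before this heartbeat arrived.
                return OptionalLong.of(deadline);
            }
            deadline = t + sessionTimeoutMs;
        }
        return nowMs > deadline ? OptionalLong.of(deadline) : OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Heartbeats every 3 s with a 10 s timeout: still alive at t = 9 s.
        System.out.println(expiryTime(List.of(0L, 3000L, 6000L), 10000L, 9000L).isPresent());
        // Heartbeats stop after t = 3 s (for example, a long task between polls): expired by t = 20 s.
        System.out.println(expiryTime(List.of(0L, 3000L), 10000L, 20000L).isPresent());
    }
}
```

In this model, the second consumer expires 10 s after its last heartbeat, at which point its partitions would be reassigned.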

GuangshengZuo answered Nov 14 '22 13:11