Consumer group stuck in 'rebalancing' even though there are no consumers

Tags: apache-kafka, apache-kafka-streams

I am using Kafka 2.4.1 (recently upgraded from 2.2.0) and noticed a strange problem.

Even though the application (Kafka Streams) is down and no instance of it is running, the consumer group command reports the group state as rebalancing. Our application runs as a Kubernetes pod.

root@bastion-0:# ./kafka-consumer-groups --describe --group groupname --bootstrap-server kafka-0.local:9094 

Warning: Consumer group 'groupname' is rebalancing.

I have waited about 30 minutes and the command still reports 'rebalancing' even though the application is down.

If I try to delete the group, I get the following message.

root@bastion-0:/app/kafka_2.12-2.4.1/bin# ./kafka-consumer-groups.sh --delete --group group1  --bootstrap-server kafka.local:9094 

Error: Deletion of some consumer groups failed:
* Group 'group1' could not be deleted due to: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.GroupNotEmptyException: The group is not empty.
root@bastion-0:/app/kafka_2.12-2.4.1/bin# ./kafka-consumer-groups.sh --delete --group group2  --bootstrap-server kafka.local:9094 

Error: Deletion of some consumer groups failed:
* Group 'group2' could not be deleted due to: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.GroupNotEmptyException: The group is not empty.

When I look at the group members, members are still listed even though the application is NOT running. Is this because of the new rebalance protocol (cooperative rebalancing)?

Where does ./kafka-consumer-groups read the group membership information from? Does the broker keep the member information after the application goes down?

Update:

I brought up the application with a different group name and it came up fine. I can also describe the new group. Even so, the old group remains in the 'rebalancing' state.

New update: I also found that the group coordinator for all the affected groups was the same node in the Kafka cluster, and when I rebooted that node, the problem went away.

Question:

Where is group metadata stored? Can the problem be related to a corrupted ZooKeeper?

asked Apr 30 '20 14:04

SunilS

1 Answer

This was raised as a bug at https://issues.apache.org/jira/browse/KAFKA-9935, which duplicates https://issues.apache.org/jira/browse/KAFKA-9752.

It appears to have been fixed since March in versions 2.2.3, 2.3.2, 2.4.2, and 2.5 and above, so make sure to use an up-to-date version.
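
On the side question of where group metadata is stored: as far as I know, for the Java consumer and Kafka Streams it lives on the brokers in the internal `__consumer_offsets` topic, not in ZooKeeper, and the broker leading the relevant partition of that topic acts as the group coordinator. That would fit with the problem going away after rebooting one node. The sketch below estimates which partition holds a given group. It assumes the broker picks the partition as the absolute value of the Java `hashCode` of the group id modulo `offsets.topic.num.partitions` (default 50), and that the group id is plain ASCII. The function names `java_hash` and `offsets_partition` are made up for this example.

```shell
#!/usr/bin/env bash

# Java's String.hashCode for an ASCII string, as a signed 32-bit integer.
java_hash() {
  local s=$1 h=0 i c
  for ((i = 0; i < ${#s}; i++)); do
    # Numeric code of the i-th character.
    printf -v c '%d' "'${s:i:1}"
    # Keep only the low 32 bits, as Java int arithmetic would.
    h=$(( (31 * h + c) & 0xFFFFFFFF ))
  done
  # Reinterpret the unsigned 32-bit value as a signed int.
  if (( h >= 0x80000000 )); then
    h=$(( h - 0x100000000 ))
  fi
  echo "$h"
}

# Partition of __consumer_offsets assumed to hold the group's metadata.
# $1 = group id, $2 = partition count (assumed default: 50).
offsets_partition() {
  local group=$1 partitions=${2:-50} h
  h=$(java_hash "$group")
  # Absolute value, with Integer.MIN_VALUE mapped to 0.
  if (( h == -2147483648 )); then
    h=0
  elif (( h < 0 )); then
    h=$(( -h ))
  fi
  echo $(( h % partitions ))
}
```

Calling `offsets_partition group1` prints the partition number; describing the `__consumer_offsets` topic with `kafka-topics.sh --describe` then shows which broker leads that partition. If your cluster overrides `offsets.topic.num.partitions`, pass that count as the second argument.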

answered Oct 11 '22 10:10

Dennis Jaheruddin
