
Kafka Java consumer marked as dead for group

I'm using a Java consumer to consume messages from a topic (Kafka version 0.10.0.1). It works fine when I run it outside a Docker container. When I run it inside a Docker container, however, the coordinator is marked as dead for the group, with the message:

Marking the coordinator local.kafka.com:9092 (id: 2147483647 rack: null) dead for group my-group

My consumer configuration is as follows:

metadata.max.age.ms = 300000
partition.assignment.strategy =[org.apache.kafka.clients.consumer.RangeAssignor]
reconnect.backoff.ms = 50
sasl.kerberos.ticket.renew.window.factor = 0.8
max.partition.fetch.bytes = 1048576
bootstrap.servers = [192.168.115.128:9092, 192.168.115.128:9093]
ssl.keystore.type = JKS
enable.auto.commit = true
sasl.mechanism = GSSAPI
interceptor.classes = null
exclude.internal.topics = true
ssl.truststore.password = null
client.id = consumer-1
ssl.endpoint.identification.algorithm = null
max.poll.records = 2147483647
check.crcs = true
request.timeout.ms = 40000
heartbeat.interval.ms = 3000
auto.commit.interval.ms = 5000
receive.buffer.bytes = 65536
ssl.truststore.type = JKS
ssl.truststore.location = null
ssl.keystore.password = null
fetch.min.bytes = 1
send.buffer.bytes = 131072
value.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
group.id = my-group
retry.backoff.ms = 100
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
ssl.trustmanager.algorithm = PKIX
ssl.key.password = null
fetch.max.wait.ms = 500
sasl.kerberos.min.time.before.relogin = 60000
connections.max.idle.ms = 540000
session.timeout.ms = 30000
metrics.num.samples = 2
key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
ssl.protocol = TLS
ssl.provider = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.keystore.location = null
ssl.cipher.suites = null
security.protocol = PLAINTEXT
ssl.keymanager.algorithm = SunX509
metrics.sample.window.ms = 30000
auto.offset.reset = earliest

The auto.commit property is set to false and the poll.timeout is set to 10000. Can somebody please point out where I am mistaken?

Apollo asked Oct 12 '16 06:10



1 Answer

It might be the advertised.listeners setting in your broker config (or the lack of one): after the first discovery call to bootstrap.servers, the broker may be passing the consumer an incorrect URL.

This can cause the consumer to use an incorrect URL for the rest of the RPC calls.
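
One way to check this from inside the container is to take the host and port the client logged for the coordinator and see whether they can be reached from there. In the question, bootstrap.servers uses the IP 192.168.115.128, while the log line names local.kafka.com:9092, which the container may not be able to resolve. The following is only a rough sketch: the class name CoordinatorCheck, its method names, and the 3000 ms timeout are made up for illustration, and a successful TCP connect does not prove the broker is healthy.

```java
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Kafka client API.
public class CoordinatorCheck {
    // Matches the "host:port" that follows "Marking the coordinator" in the client log line.
    private static final Pattern COORDINATOR =
            Pattern.compile("Marking the coordinator (\\S+):(\\d+) ");

    /** Extracts {host, port} from the log line, or returns null if the line does not match. */
    static String[] parseCoordinator(String logLine) {
        Matcher m = COORDINATOR.matcher(logLine);
        return m.find() ? new String[] {m.group(1), m.group(2)} : null;
    }

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (Exception e) {
            // Unresolvable host name, refused connection, or timeout.
            return false;
        }
    }

    public static void main(String[] args) {
        // Log line copied from the question.
        String line = "Marking the coordinator local.kafka.com:9092 "
                + "(id: 2147483647 rack: null) dead for group my-group";
        String[] hostPort = parseCoordinator(line);
        boolean ok = canConnect(hostPort[0], Integer.parseInt(hostPort[1]), 3000);
        System.out.println(hostPort[0] + ":" + hostPort[1] + " reachable: " + ok);
    }
}
```

If this reports the address as unreachable inside the container but reachable on the host, the address the broker advertises is the likely culprit. In that case, either set advertised.listeners in the broker's server.properties to an address the container can reach, or make the advertised host name resolvable from inside the container.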

PragmaticProgrammer answered Sep 25 '22 09:09
