
Difference between session.timeout.ms and max.poll.interval.ms for Kafka >= 0.10.1

I am unclear why we need both session.timeout.ms and max.poll.interval.ms, and when we would use one, the other, or both. It seems like both settings indicate the upper bound on the time the coordinator will wait for a heartbeat from a consumer before assuming it's dead.

Also, how does this behave for versions 0.10.1.0+, based on KIP-62?

Deeps asked Sep 27 '16 16:09



1 Answer

Before KIP-62 (i.e., Kafka 0.10.0 and earlier), there was only session.timeout.ms. max.poll.interval.ms was introduced via KIP-62 (part of Kafka 0.10.1).

KIP-62 decouples heartbeats from calls to poll() via a background heartbeat thread. This allows the processing time (i.e., the time between two consecutive poll() calls) to be longer than the heartbeat interval.

Assume processing a message takes 1 minute. If heartbeats and poll() are coupled (i.e., before KIP-62), you need to set session.timeout.ms larger than 1 minute to prevent the consumer from timing out. However, if the consumer dies, it then also takes longer than 1 minute to detect the failure.

KIP-62 decouples polling and heartbeats, allowing heartbeats to be sent between two consecutive polls. Now you have two threads running, the heartbeat thread and the processing thread, and thus KIP-62 introduced a timeout for each: session.timeout.ms is for the heartbeat thread, while max.poll.interval.ms is for the processing thread.

Assume you set session.timeout.ms=30000; the consumer heartbeat thread must then send a heartbeat to the broker before this time expires. On the other hand, if processing a single message takes 1 minute, you can set max.poll.interval.ms larger than one minute to give the processing thread more time to process a message.
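As a rough sketch of that setup: the two property names below are the real consumer settings, but the values, the plain dict, and the helper poll_interval_is_sufficient (including its messages_per_poll parameter) are my own illustration, not part of any Kafka client. The helper assumes the time between two poll() calls is simply the per-message time multiplied by the number of messages handled per poll.

```python
# Illustrative values only; the keys are the actual consumer property names.
consumer_config = {
    "session.timeout.ms": 30_000,     # deadline for the heartbeat thread
    "max.poll.interval.ms": 120_000,  # max allowed gap between poll() calls
}


def poll_interval_is_sufficient(config, per_message_ms, messages_per_poll=1):
    """Hypothetical helper: does max.poll.interval.ms cover one processing round?

    Assumes the gap between two poll() calls equals
    per_message_ms * messages_per_poll.
    """
    worst_case_ms = per_message_ms * messages_per_poll
    return config["max.poll.interval.ms"] > worst_case_ms


# One message per poll, 1 minute each: 120 s leaves enough room.
print(poll_interval_is_sufficient(consumer_config, per_message_ms=60_000))  # True

# Five such messages per poll would take 5 minutes: the setting is too small.
print(poll_interval_is_sufficient(consumer_config, 60_000, messages_per_poll=5))  # False
```

Note that session.timeout.ms does not enter this check at all; it only has to cover the heartbeat thread, so it can stay at 30 seconds no matter how long processing takes.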

If the processing thread dies, it takes max.poll.interval.ms to detect this. However, if the whole consumer dies (and a dying processing thread most likely crashes the whole consumer including the heartbeat thread), it takes only session.timeout.ms to detect it.
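To make the two cases concrete, here is a simplified model of which timeout bounds the detection time. The function detection_time_ms and the failure-mode labels are hypothetical names for this illustration; real detection also involves broker-side timing that is ignored here.

```python
def detection_time_ms(failure, session_timeout_ms, max_poll_interval_ms):
    """Rough upper bound on how long a failure goes unnoticed (simplified model)."""
    if failure == "consumer_crash":
        # The whole process is gone, so heartbeats stop as well.
        return session_timeout_ms
    if failure == "processing_stuck":
        # Heartbeats keep flowing, but poll() is never called again.
        return max_poll_interval_ms
    raise ValueError(f"unknown failure mode: {failure}")


# With KIP-62: a crash is noticed after 30 s even though processing may take minutes.
print(detection_time_ms("consumer_crash", 30_000, 120_000))    # 30000
print(detection_time_ms("processing_stuck", 30_000, 120_000))  # 120000
```

Before KIP-62 there was only one timeout, and it had to exceed the processing time, so both cases would have been bounded by the same (large) value.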

The idea is to allow quick detection of a failing consumer even if processing itself takes quite long.

Implementation Detail

The new timeout max.poll.interval.ms is mainly a client-side concept: if poll() is not called within max.poll.interval.ms, the heartbeat thread detects this and sends a leave-group request to the broker. max.poll.interval.ms is also relevant for consumer group rebalances: if a rebalance is triggered, consumers have max.poll.interval.ms to rejoin the group by calling poll() on the client side, which triggers a join-group request.
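The client-side check described above can be pictured with a toy model. This is not the actual Kafka client code: the class PollWatchdog, its method names, and the returned strings are invented for illustration, and time is passed in explicitly instead of using real threads.

```python
class PollWatchdog:
    """Toy model of the heartbeat thread's poll-interval check (not real client code)."""

    def __init__(self, max_poll_interval_ms, now_ms=0):
        self.max_poll_interval_ms = max_poll_interval_ms
        self.last_poll_ms = now_ms
        self.in_group = True

    def on_poll(self, now_ms):
        # Called by the processing thread; a poll() after leaving rejoins the group.
        self.last_poll_ms = now_ms
        self.in_group = True

    def heartbeat_tick(self, now_ms):
        # Called periodically by the heartbeat thread.
        if not self.in_group:
            return "idle"
        if now_ms - self.last_poll_ms > self.max_poll_interval_ms:
            self.in_group = False
            return "leave-group"
        return "heartbeat"


watchdog = PollWatchdog(max_poll_interval_ms=300_000)
print(watchdog.heartbeat_tick(3_000))    # heartbeat
print(watchdog.heartbeat_tick(300_001))  # leave-group
watchdog.on_poll(310_000)
print(watchdog.heartbeat_tick(313_000))  # heartbeat
```

The point of the model is that the broker never measures the poll interval itself; it only sees either heartbeats or an explicit leave-group request from the client.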

Matthias J. Sax answered Nov 05 '22 13:11