
How does kafka consumer auto commit work?

I am reading this one:

Automatic Commit The easiest way to commit offsets is to allow the consumer to do it for you. If you configure enable.auto.commit=true, then every five seconds the consumer will commit the largest offset your client received from poll(). The five-second interval is the default and is controlled by setting auto.commit.interval.ms. Just like everything else in the consumer, the automatic commits are driven by the poll loop. Whenever you poll, the consumer checks if it is time to commit, and if it is, it will commit the offsets it returned in the last poll.

Maybe the issue is that my English is not good, but I do not fully understand this description.

Let's say I use auto-commit with the default interval of 5 sec, and poll happens every 7 sec. In this case, will a commit happen every 5 sec or every 7 sec?

Can you also clarify the behaviour if poll happens every 3 sec? Will a commit happen every 5 sec or every 6 sec?

I have also read this one:

Auto commits: You can set auto.commit to true and set the auto.commit.interval.ms property with a value in milliseconds. Once you've enabled this, the Kafka consumer will commit the offset of the last message received in response to its poll() call. The poll() call is issued in the background at the set auto.commit.interval.ms.

And it contradicts the answer.

Can you explain this in detail?

Let's say I have a timeline like this:

0 sec - poll
4 sec - poll
8 sec - poll

When will the offsets be committed, and which ones?

gstackoverflow asked Oct 03 '17 14:10



2 Answers

The auto-commit check runs on every poll: it tests whether the time elapsed since the last commit has reached the configured interval (auto.commit.interval.ms). If so, the offsets are committed.

If the commit interval is 5 seconds and poll is called every 7 seconds, a commit will happen only every 7 seconds.
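For reference, the "configured time" here comes from the two consumer properties named in the question. Below is a minimal sketch of setting them with plain `java.util.Properties`. The class name `AutoCommitConfig`, the broker address, and the group id are placeholders invented for illustration; only the property keys come from Kafka itself.

```java
import java.util.Properties;

class AutoCommitConfig {
    /** Builds consumer properties with auto-commit enabled at the default 5-second interval. */
    static Properties consumerProps() {
        Properties props = new Properties();
        // Hypothetical values: replace with your own broker address and group id.
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "demo-group");
        // Let the consumer commit offsets from inside the poll loop.
        props.setProperty("enable.auto.commit", "true");
        // Minimum time between auto-commits, in milliseconds (5000 is the default).
        props.setProperty("auto.commit.interval.ms", "5000");
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProps();
        System.out.println("enable.auto.commit = " + props.getProperty("enable.auto.commit"));
        System.out.println("auto.commit.interval.ms = " + props.getProperty("auto.commit.interval.ms"));
    }
}
```

A real consumer would also need key and value deserializers before these properties could be passed to a `KafkaConsumer` constructor. They are left out here to keep the sketch free of external dependencies.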

Liju John answered Sep 19 '22 22:09


The consumer tries to auto-commit as soon as possible, but only when poll is called. You can look at the source code of the consumer coordinator, which has a set of class-level fields that record whether auto-commit is enabled, what the interval is, and what the next auto-commit deadline is.

https://github.com/apache/kafka/blob/10cd98cc894b88c5d1e24fc54c66361ad9914df2/clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerCoordinator.java#L625

And here is one of the places within poll that triggers the commit: https://github.com/apache/kafka/blob/10cd98cc894b88c5d1e24fc54c66361ad9914df2/clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerCoordinator.java#L279

That being said, suppose poll is executed every 7 seconds and the auto-commit interval is set to 5:

0 - poll, + set deadline to 5th second

7 - poll + commit due to deadline, update deadline to 7+5=12

14 - poll + commit due to deadline, update deadline to 14+5=19

However, if poll is called every 3 seconds and the auto-commit interval is set to 5:

0 - poll, + set deadline to 5th second

3 - poll, no commit

6 - poll + commit due to deadline, update deadline to 6+5=11
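The timelines above can be reproduced with a small simulation. This is a simplified model of the deadline check, not Kafka's actual code: the class `AutoCommitTimeline` and the method `commitTimes` are made up for illustration, time is in whole seconds, and the model assumes the first poll only sets the deadline. It ignores coordinator discovery, rebalances, and the commit performed when the consumer is closed.

```java
import java.util.ArrayList;
import java.util.List;

class AutoCommitTimeline {
    /**
     * Returns the poll times (in seconds) at which an auto-commit would be triggered,
     * assuming a commit fires on the first poll at or after the current deadline and
     * the deadline is then moved to (poll time + interval).
     */
    static List<Long> commitTimes(long[] pollTimes, long intervalSec) {
        List<Long> commits = new ArrayList<>();
        // The first poll only sets the initial deadline.
        long deadline = pollTimes.length > 0 ? pollTimes[0] + intervalSec : 0;
        for (long now : pollTimes) {
            if (now >= deadline) {
                commits.add(now);
                deadline = now + intervalSec; // next commit no earlier than this
            }
        }
        return commits;
    }

    public static void main(String[] args) {
        System.out.println(commitTimes(new long[] {0, 7, 14}, 5)); // prints [7, 14]
        System.out.println(commitTimes(new long[] {0, 3, 6}, 5));  // prints [6]
        System.out.println(commitTimes(new long[] {0, 4, 8}, 5));  // prints [8]
    }
}
```

Under this model, the 0/4/8 timeline from the question commits once, during the poll at second 8. Going by the description quoted in the question, that commit covers the offsets returned by the previous poll (the one at second 4).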

zubrabubra answered Sep 21 '22 22:09