Consumer.poll() returns new records even without committing offsets?

Tags: apache-kafka, kafka-consumer-api

If I set enable.auto.commit=false and call consumer.poll() without calling consumer.commitAsync() afterwards, why does consumer.poll() return new records the next time it's called?

Since I did not commit the offset, I expected poll() to start again from the last committed offset and return the same records.

I'm asking because I'm trying to handle failure scenarios during my processing. I was hoping that, without an offset commit, poll() would return the same records again so I could re-process the ones that failed.

public class MyConsumer implements Runnable {
    // "consumer" is a KafkaConsumer<String, LogLine> that is created and subscribed elsewhere.
    @Override
    public void run() {
        while (true) {
            ConsumerRecords<String, LogLine> records = consumer.poll(Long.MAX_VALUE);
            for (ConsumerRecord<String, LogLine> record : records) {
                try {
                    // process record
                    consumer.commitAsync();
                } catch (Exception e) {
                    // the exception is swallowed here; nothing is committed for this record
                }
                /**
                If an exception happens above, I was expecting the next poll to return
                the same records again so I can re-process the record that caused the exception.
                **/
            }
        }
    }
}

978

asked Apr 19 '17 17:04

Glide

2 Answers

The starting offset of a poll is decided by the consumer, not by the broker. The consumer keeps its current position in memory (the offset after the last record it received) and asks for the next batch of records from there on the following poll, whether or not anything has been committed.

Offset commits come into play when a consumer stops or fails and another instance that is not aware of the last consumed offset picks up consumption of a partition.

KafkaConsumer has pretty extensive Javadoc that is well worth a read.
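To make the difference between the in-memory position and the committed offset concrete, here is a minimal toy model in plain Java. It is not the Kafka client: `ToyConsumer` and all of its methods are hypothetical stand-ins that only mimic the behaviour described above. In the real client, the corresponding operation for rewinding is `KafkaConsumer.seek(TopicPartition, long)`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy, in-memory model of a single-partition consumer. This is NOT the Kafka
// client API; every name here is hypothetical and only mirrors its behaviour.
class ToyConsumer {
    private final List<String> log;  // the partition's records, index == offset
    private int position = 0;        // in-memory fetch position (next offset to read)
    private int committed = 0;       // last committed offset (survives a restart)

    ToyConsumer(List<String> log) {
        this.log = log;
    }

    // Returns up to max records starting at the position and advances the
    // position, whether or not anything has been committed.
    List<String> poll(int max) {
        int end = Math.min(position + max, log.size());
        List<String> batch = new ArrayList<>(log.subList(position, end));
        position = end;
        return batch;
    }

    // Records the current position as the committed offset.
    void commit() {
        committed = position;
    }

    // Moves the fetch position explicitly, e.g. back to a failed record.
    void seek(int offset) {
        position = offset;
    }

    // Models a restart or rebalance: the in-memory position is lost and
    // consumption resumes from the committed offset.
    void restart() {
        position = committed;
    }

    public static void main(String[] args) {
        ToyConsumer consumer = new ToyConsumer(Arrays.asList("r0", "r1", "r2", "r3"));
        System.out.println(consumer.poll(2)); // [r0, r1]
        System.out.println(consumer.poll(2)); // [r2, r3]: new records, nothing committed
        consumer.seek(1);                     // rewind to re-process from offset 1
        System.out.println(consumer.poll(2)); // [r1, r2]
        consumer.restart();                   // nothing was committed, so back to offset 0
        System.out.println(consumer.poll(2)); // [r0, r1]
    }
}
```

So within a running consumer, the way to get a failed record again is to seek back to its offset before the next poll; skipping the commit only matters once the position is lost.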

137

answered Sep 30 '22 18:09

ftr

The consumer will read from the last committed offset if the group is rebalanced (that is, if a consumer leaves the group or a new one joins), so records processed after that commit can be delivered again. De-duplication is therefore not straightforward in Kafka. Either store the last processed offset in an external store and, when a rebalance happens or the application restarts, seek to that offset and resume processing from there, or check a unique key in each message against a database to find out whether it is a duplicate.
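As a rough illustration of the first option, here is a small plain-Java sketch. The `HashMap` only stands in for the external store, and `OffsetDeduplicator`, `handle` and `resumeOffset` are hypothetical names, not part of any Kafka API. With the real client, the value returned by `resumeOffset` is what you would pass to `KafkaConsumer.seek(...)`, typically from `ConsumerRebalanceListener.onPartitionsAssigned`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy sketch of offset-based de-duplication. The map stands in for an external
// store (a database table, for instance); all names here are hypothetical.
class OffsetDeduplicator {
    // partition -> next offset that still needs processing
    private final Map<Integer, Long> nextOffsetByPartition = new HashMap<>();
    private final List<String> processed = new ArrayList<>();

    // Processes a record unless its offset was already handled.
    // Returns true if the record was processed, false if it was skipped.
    boolean handle(int partition, long offset, String value) {
        long next = nextOffsetByPartition.getOrDefault(partition, 0L);
        if (offset < next) {
            return false; // redelivered after a rebalance or restart
        }
        processed.add(value); // placeholder for the real processing side effect
        // In a real system the side effect and this offset update would ideally
        // be written atomically, e.g. in the same database transaction.
        nextOffsetByPartition.put(partition, offset + 1);
        return true;
    }

    // Offset to seek to when the partition is assigned again.
    long resumeOffset(int partition) {
        return nextOffsetByPartition.getOrDefault(partition, 0L);
    }

    List<String> processed() {
        return processed;
    }

    public static void main(String[] args) {
        OffsetDeduplicator dedup = new OffsetDeduplicator();
        dedup.handle(0, 0L, "a");
        dedup.handle(0, 1L, "b");
        // Simulated redelivery from an older committed offset:
        System.out.println(dedup.handle(0, 0L, "a")); // false
        System.out.println(dedup.handle(0, 2L, "c")); // true
        System.out.println(dedup.processed());        // [a, b, c]
        System.out.println(dedup.resumeOffset(0));    // 3
    }
}
```

The unique-key variant works the same way, except that the lookup is by a message key in the database instead of by partition and offset.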

answered Sep 30 '22 18:09

kapil07
