Say I want to check the offsets of the first and last messages in Kafka for a particular partition. My idea was to use the assign(…) method along with seekToBeginning(…) and seekToEnd(…). Unfortunately, this doesn't work.
If I set AUTO_OFFSET_RESET_CONFIG to "latest", seekToBeginning(…) has no effect; if I set it to "earliest", seekToEnd(…) doesn't work. It seems that the only thing that matters for my consumer is AUTO_OFFSET_RESET_CONFIG.
I've seen a similar topic, but that problem dealt with subscribe(), not with the assign() method. The proposed solution was to implement a ConsumerRebalanceListener and pass it as a parameter to subscribe(). Unfortunately, the assign() method has only one signature and can only take a collection of topic partitions.
The question is: is it possible to use seekToBeginning() or seekToEnd() with the assign() method? If yes, how? If no, why not?
A relevant fragment of my code:
KafkaConsumer<String, ProtoMeasurement> consumer = createConsumer();
TopicPartition zeroP = new TopicPartition(TOPIC, 1);
List<TopicPartition> partitions = Collections.singletonList(zeroP);
consumer.assign(partitions);
consumer.poll(Duration.ofSeconds(1));
consumer.seekToBeginning(partitions);
long currOffsetPos = consumer.position(zeroP);
LOGGER.info("Current offset {}.", currOffsetPos);
ConsumerRecords<String, ProtoMeasurement> records = consumer.poll(Duration.ofMillis(100));
// ...
The logger prints the offset n, which is the largest (latest) offset of the considered partition.
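For reference, the seekToBeginning()/seekToEnd() methods are documented as lazily evaluated: they only take effect on the next poll() or position() call. A pattern that exploits this — seek after assign, then let position() force the seek before any records are fetched — might look like the following sketch. The topic partition and deserializer types are placeholders, not taken from the original code:

```java
import java.util.Collections;
import java.util.List;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.common.TopicPartition;

public class OffsetBounds {
    // Returns the first offset of the partition. seekToBeginning() is lazy,
    // so position() is called immediately to force the seek to be evaluated
    // before any records are fetched.
    static long firstOffset(Consumer<String, byte[]> consumer, TopicPartition tp) {
        List<TopicPartition> tps = Collections.singletonList(tp);
        consumer.assign(tps);
        consumer.seekToBeginning(tps);
        return consumer.position(tp);
    }

    // Returns the offset one past the last record in the partition.
    static long lastOffset(Consumer<String, byte[]> consumer, TopicPartition tp) {
        List<TopicPartition> tps = Collections.singletonList(tp);
        consumer.assign(tps);
        consumer.seekToEnd(tps);
        return consumer.position(tp);
    }
}
```

Note that the fragment above never polls before seeking, so the seek cannot be overridden by records already fetched under the auto.offset.reset policy.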
The seekToEnd method requires information about the actual partition (in Kafka terms, a TopicPartition) from which you want your consumer to read from the end.
With the round-robin assignor, every topic is distributed evenly across all consumers. Because this assignor treats every topic separately, you can expect all consumers to consume from all of the topics. With other assignors, one consumer might end up with all partitions of one topic and no partitions of the others.
The offset is a simple integer that Kafka uses to maintain the current position of a consumer. That's it. The current offset (the position) points one past the last record that Kafka has delivered to the consumer in the most recent poll — in other words, it is the offset of the next record to be fetched. This is why the consumer never receives the same record twice during normal polling.
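As a rough illustration of how the position advances past delivered records, here is a sketch using the client library's MockConsumer, which needs no broker; the topic name, keys, and values are made up:

```java
import java.time.Duration;
import java.util.Collections;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.MockConsumer;
import org.apache.kafka.clients.consumer.OffsetResetStrategy;
import org.apache.kafka.common.TopicPartition;

public class PositionDemo {
    public static void main(String[] args) {
        MockConsumer<String, String> consumer =
                new MockConsumer<>(OffsetResetStrategy.EARLIEST);
        TopicPartition tp = new TopicPartition("demo", 0);
        consumer.assign(Collections.singletonList(tp));
        consumer.updateBeginningOffsets(Collections.singletonMap(tp, 0L));

        // Two records at offsets 0 and 1.
        consumer.addRecord(new ConsumerRecord<>("demo", 0, 0L, "k0", "v0"));
        consumer.addRecord(new ConsumerRecord<>("demo", 0, 1L, "k1", "v1"));

        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(10));
        // After delivering the record at offset 1, the position points one
        // past it, so the next poll would start at offset 2.
        System.out.println(records.count() + " records, position=" + consumer.position(tp));
    }
}
```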
I have noticed that this behavior is buggy and inconsistent in the MockConsumer. The docs say that these methods are lazy and only take effect on the next poll() or position() call, but that is not true for the MockConsumer. In particular, I found that it works in MockConsumer between roughly versions 1.0 and 2.2.2, and is broken from 2.3.0 onwards.
Instead, I do the following, which works consistently in both the MockConsumer and the real one:
// consistently working seek to beginning
consumer.beginningOffsets(partitions).forEach(consumer::seek);
// consistently working seek to end
consumer.endOffsets(partitions).forEach(consumer::seek);
This is a bit more dangerous if other threads are concurrently calling poll(), but it works well in my case, where I just want manual control over the offset position when the application starts polling.
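To see this workaround in action without a broker, here is a sketch using MockConsumer; the topic name and the beginning/end offsets (5 and 42) are illustrative:

```java
import java.util.Collections;
import java.util.List;
import org.apache.kafka.clients.consumer.MockConsumer;
import org.apache.kafka.clients.consumer.OffsetResetStrategy;
import org.apache.kafka.common.TopicPartition;

public class SeekWorkaroundDemo {
    public static void main(String[] args) {
        MockConsumer<String, String> consumer =
                new MockConsumer<>(OffsetResetStrategy.NONE);
        TopicPartition tp = new TopicPartition("demo", 0);
        List<TopicPartition> partitions = Collections.singletonList(tp);
        consumer.assign(partitions);
        // Tell the mock what the log boundaries are.
        consumer.updateBeginningOffsets(Collections.singletonMap(tp, 5L));
        consumer.updateEndOffsets(Collections.singletonMap(tp, 42L));

        // Explicit seek via the fetched offsets: no lazy evaluation involved.
        consumer.beginningOffsets(partitions).forEach(consumer::seek);
        System.out.println("after seek to beginning: " + consumer.position(tp)); // 5

        consumer.endOffsets(partitions).forEach(consumer::seek);
        System.out.println("after seek to end: " + consumer.position(tp)); // 42
    }
}
```

Because seek(TopicPartition, long) sets the position directly, this avoids the lazy-evaluation path entirely, which is why it behaves the same in the mock and the real consumer.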