
Kafka Stream offset reset to zero for consumer group

I have written a Kafka Streams app that just filters rows based on some condition and loads them into MongoDB.

The streaming process is working fine, but due to a flaw in my code I want to reprocess all of the data.

One way is to kill the streaming app, change the consumer group id, delete the data from MongoDB, and rerun the app.

How can I achieve this without changing the consumer group id?

(I am using Kafka 0.10.)

Many thanks, Pari

Asked by Pari, Jul 20 '16 11:07


1 Answer

Apache Kafka 0.10.0.1 (released in August 2016, after the original question was asked in July) ships with a new Application Reset Tool for Kafka Streams, which is an easier and cleaner solution than simply renaming application.id.

You can execute the tool via the script bin/kafka-streams-application-reset.sh, which will also print a usage/help message.

Example:

# Run this only after ALL application instances have been stopped!
$ bin/kafka-streams-application-reset.sh --application-id my-streams-app \
                                         --input-topics my-input-topic \
                                         --intermediate-topics rekeyed-topic \
                                         --bootstrap-servers brokerHost:9092 \
                                         --zookeeper zookeeperHost:2181
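
Since the reset tool should only run once every application instance is stopped, it can help to print the exact command for review before executing it. The following is a minimal sketch, not part of Kafka: the helper name build_reset_cmd is hypothetical, and the application id, topics, and hosts are the same placeholders used in the example above.

```shell
#!/bin/sh
# Hypothetical helper, not part of Kafka: prints the reset command
# instead of executing it, so the arguments can be reviewed first.
# Args: application id, input topics, intermediate topics (may be empty),
#       bootstrap servers, ZooKeeper address.
build_reset_cmd() {
  printf '%s' "bin/kafka-streams-application-reset.sh"
  printf ' --application-id %s' "$1"
  printf ' --input-topics %s' "$2"
  # Only pass --intermediate-topics when the application has any.
  if [ -n "$3" ]; then
    printf ' --intermediate-topics %s' "$3"
  fi
  printf ' --bootstrap-servers %s' "$4"
  printf ' --zookeeper %s\n' "$5"
}

# Placeholder values taken from the example above.
build_reset_cmd my-streams-app my-input-topic rekeyed-topic \
                brokerHost:9092 zookeeperHost:2181
```

Once the printed line looks right, it can be run by hand after all application instances have been stopped.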

That said, I'd recommend reading "Data Reprocessing with Kafka Streams: Resetting a Streams Application" by Matthias J. Sax for further details. That article also explains why simply renaming application.id (the workaround until now) isn't the best idea.

Answered by Michael G. Noll, Sep 22 '22 12:09