I have written a Kafka streaming app that filters rows based on a condition and loads them into MongoDB.
The streaming process works fine, but because of a flaw in my code I want to reprocess all of the data.
One way is to kill the streaming app, change the consumer group id, delete the data from MongoDB, and rerun the app.
How can I achieve this without changing the consumer group id?
(I am using Kafka 0.10.)
Many thanks, Pari
Apache Kafka 0.10.0.1 (released in August, whereas the original question was asked in July) ships with a new Application Reset Tool for Kafka Streams, which is an easier and cleaner solution than simply renaming application.id.
You can execute the tool via the script bin/kafka-streams-application-reset.sh, which will also print a usage/help message.
Example:
# Run this only after ALL application instances were stopped!
$ bin/kafka-streams-application-reset.sh --application-id my-streams-app \
--input-topics my-input-topic \
--intermediate-topics rekeyed-topic \
--bootstrap-servers brokerHost:9092 \
--zookeeper zookeeperHost:2181
That said, I'd recommend reading Data Reprocessing with Kafka Streams: Resetting a Streams Application, written by Matthias J. Sax, for further details. That article also explains why simply renaming application.id (which was the workaround until now) isn't the best idea.
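Note that the reset tool only handles the Kafka side: it rewinds the application's committed offsets on the input topics and deletes its internal topics. Local state and anything already written to MongoDB are left untouched. A minimal sketch of the whole reprocessing run from the question, keeping the same application.id, might look like the script below. The database and collection names, the use of the mongo shell, and the DRY_RUN switch are assumptions made for illustration; /tmp/kafka-streams is the default state.dir and needs adjusting if the app overrides it.

```shell
#!/usr/bin/env bash
# Sketch of a full reprocessing run. Every value below is a placeholder.
set -eu

APP_ID="my-streams-app"            # application.id of the Streams app
INPUT_TOPICS="my-input-topic"
BOOTSTRAP_SERVERS="brokerHost:9092"
ZOOKEEPER="zookeeperHost:2181"
STATE_DIR="/tmp/kafka-streams"     # default state.dir; adjust if overridden
MONGO_DB="mydb"                    # hypothetical database name
MONGO_COLLECTION="filtered_rows"   # hypothetical collection name
DRY_RUN="${DRY_RUN:-1}"            # 1 = only print the commands (assumed switch)

# Print a command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

reprocess() {
  # 1. Stop ALL application instances first (deployment-specific, not shown).

  # 2. Reset the committed input offsets and delete internal topics.
  run bin/kafka-streams-application-reset.sh \
    --application-id "$APP_ID" \
    --input-topics "$INPUT_TOPICS" \
    --bootstrap-servers "$BOOTSTRAP_SERVERS" \
    --zookeeper "$ZOOKEEPER"

  # 3. Remove local state; repeat on every machine that ran an instance.
  run rm -rf "${STATE_DIR:?}/${APP_ID:?}"

  # 4. Delete the output of the previous (flawed) run.
  run mongo "$MONGO_DB" --eval "db.${MONGO_COLLECTION}.deleteMany({})"

  # 5. Restart the application with the SAME application.id (not shown).
}

reprocess
```

With the default DRY_RUN=1 the script only prints the three commands, so they can be reviewed first; running it with DRY_RUN=0 executes them. A filter-only application keeps no state stores, so step 3 is usually a no-op there. As an alternative to deleting the directory by hand, the application can call KafkaStreams#cleanUp() before start().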