I have a Kafka Streams app that has an external dependency. If the dependency is not available, I want to reprocess the message(s) later. I can't control the offsets, since Kafka Streams manages them internally. What is the best way to accomplish this?
You would need to put the message into a state store, and later read it back out of the store and retry.
You can do the retry either during regular processing or by scheduling a punctuation.
Check out the docs for more details: https://docs.confluent.io/current/streams/developer-guide/processor-api.html#defining-a-stream-processor
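To make the idea concrete, here is a minimal, library-free sketch of the pattern in plain Java. It is not the Kafka Streams API itself: the class `RetryingProcessor`, its `externalCall` predicate, and the in-memory `Map` are hypothetical stand-ins. In a real topology the map would be a state store such as a `KeyValueStore`, and `punctuate()` would be a punctuator registered via `ProcessorContext#schedule`.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiPredicate;

/**
 * Sketch of "buffer failed records in a store, retry them on a punctuation".
 * All names here are illustrative assumptions, not Kafka Streams classes.
 */
class RetryingProcessor<K, V> {
    // Stand-in for a state store; insertion order is kept so retries run oldest first.
    private final Map<K, V> store = new LinkedHashMap<>();

    // Hypothetical external call: returns true on success,
    // false when the dependency is unavailable.
    private final BiPredicate<K, V> externalCall;

    RetryingProcessor(BiPredicate<K, V> externalCall) {
        this.externalCall = externalCall;
    }

    /** Regular processing path: try the call, buffer the record if it fails. */
    void process(K key, V value) {
        if (!externalCall.test(key, value)) {
            store.put(key, value);
        }
    }

    /** Retry path: would be invoked periodically by a scheduled punctuation. */
    void punctuate() {
        Iterator<Map.Entry<K, V>> it = store.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<K, V> entry = it.next();
            if (externalCall.test(entry.getKey(), entry.getValue())) {
                it.remove(); // delivered, so drop it from the buffer
            }
        }
    }

    /** Number of records still waiting for a successful retry. */
    int pending() {
        return store.size();
    }
}
```

Note that this sketch keys the buffer by record key, so a second failed record with the same key overwrites the first. If every record must be retried, a unique buffer key (for example, one derived from topic, partition, and offset) would be needed instead.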
As I understand Kafka Streams and the overall Confluent Platform architecture, you shouldn't communicate with external resources directly from a Kafka Streams application. One of the basic concepts is that a Kafka Streams application's inputs and outputs are just Kafka topics. Communication with any other external resource should go through Kafka Connect. There are many connectors made by Confluent and the community, and you can write your own implementation if needed.
With this approach you don't need to implement retries on your own. In addition, message processing in Kafka Streams isn't blocked by long-running I/O operations, which could negatively affect other components of the streams topology. All blocking operations and retries are handled in the Kafka Connect connector, which is designed for this type of work. The connector should be fault tolerant and guarantee delivery.
Here is a simple diagram from the Confluent blog that shows the described approach.