I have a basic stream processing flow which looks like
master topic -> my processing in a mapper/filter -> output topics
and I am wondering about the best way to handle "bad messages". These could be things like messages that I can't deserialize properly, or cases where the processing/filtering logic fails in some unexpected way (I have no external dependencies, so there should be no transient errors of that sort).
I was considering wrapping all my processing/filtering code in a try catch and if an exception was raised then routing to an "error topic". Then I can study the message and modify it or fix my code as appropriate and then replay it on to master. If I let any exceptions propagate, the stream seems to get jammed and no more messages are picked up.
For completeness here is my code (pseudo-ish):
class Document {
    // Fields
}

class AnalysedDocument {
    Document document;
    String rawValue;
    Exception exception;
    Analysis analysis;

    // All being well
    AnalysedDocument(Document document, Analysis analysis) {...}
    // Analysis failed
    AnalysedDocument(Document document, Exception exception) {...}
    // Deserialisation failed
    AnalysedDocument(String rawValue, Exception exception) {...}
}

KStreamBuilder builder = new KStreamBuilder();

KStream<String, AnalysedDocument> analysedDocumentStream = builder
    .stream(Serdes.String(), Serdes.String(), "master")
    .mapValues(new ValueMapper<String, AnalysedDocument>() {
        @Override
        public AnalysedDocument apply(String rawValue) {
            Document document;
            try {
                // Deserialise
                document = ...
            } catch (Exception e) {
                return new AnalysedDocument(rawValue, e);
            }
            try {
                // Perform analysis
                Analysis analysis = ...
                return new AnalysedDocument(document, analysis);
            } catch (Exception e) {
                return new AnalysedDocument(document, e);
            }
        }
    });

// Branch based on whether the analysis mapping failed
KStream<String, AnalysedDocument>[] branches = analysedDocumentStream.branch(
    (key, value) -> value.exception != null,   // error records
    (key, value) -> true);                      // successful records
KStream<String, AnalysedDocument> errorStream = branches[0];
KStream<String, AnalysedDocument> successStream = branches[1];

errorStream.to(Serdes.String(), customPojoSerde(), "error");
successStream.to(Serdes.String(), customPojoSerde(), "analysed");

KafkaStreams streams = new KafkaStreams(builder, config);
streams.start();
Any help greatly appreciated.
As a first point: if you only have a producer producing messages, you don't need Kafka Streams at all. And if you consume messages from one Kafka cluster but publish to topics on a different Kafka cluster, a single Streams application cannot do that by itself; you can still use Kafka Streams for the processing, but you have to use a separate Producer to publish the messages to the other cluster.
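A rough sketch of that second case, assuming a hypothetical target cluster at cluster-b:9092 and a hypothetical output topic named "mirrored":

Properties clusterBProps = new Properties();
clusterBProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "cluster-b:9092");
clusterBProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
clusterBProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
Producer<String, String> clusterBProducer = new KafkaProducer<>(clusterBProps);

// the Streams application itself only talks to the source cluster;
// results are forwarded to the second cluster by hand
builder.stream(Serdes.String(), Serdes.String(), "master")
       .foreach((key, value) -> clusterBProducer.send(new ProducerRecord<>("mirrored", key, value)));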
Kafka broker configuration: an optional property, message.max.bytes, can be used to allow all topics on a broker to accept messages greater than 1 MB in size. It holds the size of the largest record batch allowed by Kafka after compression (if compression is enabled).
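If only a single topic (rather than every topic on the broker) needs to accept larger messages, the topic-level counterpart max.message.bytes can be raised instead, for example via the AdminClient; the topic name and the ~2 MB limit below are placeholders:

import java.util.Collection;
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

Properties adminProps = new Properties();
adminProps.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
try (AdminClient admin = AdminClient.create(adminProps)) {
    ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "master");
    AlterConfigOp raiseLimit = new AlterConfigOp(
            new ConfigEntry("max.message.bytes", "2097152"),  // ~2 MB
            AlterConfigOp.OpType.SET);
    Collection<AlterConfigOp> ops = Collections.singletonList(raiseLimit);
    admin.incrementalAlterConfigs(Collections.singletonMap(topic, ops)).all().get();
} catch (InterruptedException | ExecutionException e) {
    // the alteration failed or was interrupted; log and handle as appropriate
}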
Right now, Kafka Streams offers only limited error handling capabilities. There is work in progress to simplify this. For now, your overall approach seems to be a good way to go.
One comment about handling de/serialization errors: handling those errors manually requires you to do the de/serialization "manually" as well. This means you need to configure ByteArraySerdes for the key and value of the input/output topics of your Streams app and add a map() that does the de/serialization (i.e., KStream<byte[],byte[]> -> map() -> KStream<keyType,valueType> -- or the other way round if you also want to catch serialization exceptions). Otherwise, you cannot try-catch deserialization exceptions.
With your current approach, you "only" validate that the given string represents a valid document -- but it could be the case that the message itself is corrupted and cannot be converted into a String in the source operator in the first place. Thus, your code doesn't actually cover deserialization exceptions. However, if you are sure a deserialization exception can never happen, your approach would be sufficient, too.
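To make that concrete, here is a rough sketch of the byte-array pattern, reusing the classes from the question; deserializeDocument() and analyse() are hypothetical helpers standing in for your actual deserialization and analysis code:

// consume raw bytes so that nothing can fail inside the source operator itself
KStream<byte[], byte[]> rawStream = builder.stream(Serdes.ByteArray(), Serdes.ByteArray(), "master");

KStream<String, AnalysedDocument> analysedStream = rawStream.map((keyBytes, valueBytes) -> {
    String key = keyBytes == null ? null : new String(keyBytes, StandardCharsets.UTF_8);
    String rawValue = valueBytes == null ? null : new String(valueBytes, StandardCharsets.UTF_8);
    try {
        Document document = deserializeDocument(valueBytes);                            // hypothetical helper
        return KeyValue.pair(key, new AnalysedDocument(document, analyse(document)));   // hypothetical helper
    } catch (Exception e) {
        // covers corrupted bytes as well as documents that fail analysis
        return KeyValue.pair(key, new AnalysedDocument(rawValue, e));
    }
});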
Update
This issue is tackled via KIP-161 and will be included in the next release, 1.0.0. It allows you to register a callback via the parameter default.deserialization.exception.handler. The handler will be invoked every time an exception occurs during deserialization and allows you to return a DeserializationHandlerResponse -- either CONTINUE (drop the record and move on) or FAIL (the default).
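For illustration, a minimal sketch of how this can be wired up in 1.0.0 -- either register the built-in LogAndContinueExceptionHandler, or implement the callback yourself (the class name SkipCorruptedRecordsHandler below is just an example):

import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.streams.errors.DeserializationExceptionHandler;
import org.apache.kafka.streams.processor.ProcessorContext;

public class SkipCorruptedRecordsHandler implements DeserializationExceptionHandler {
    @Override
    public DeserializationHandlerResponse handle(final ProcessorContext context,
                                                 final ConsumerRecord<byte[], byte[]> record,
                                                 final Exception exception) {
        // e.g. log record.topic(), record.partition(), record.offset(), then skip the record
        return DeserializationHandlerResponse.CONTINUE;
    }

    @Override
    public void configure(final Map<String, ?> configs) { }
}

// register the handler in the Streams config from the question
config.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
           SkipCorruptedRecordsHandler.class);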
Update 2
With KIP-210 (which will be part of Kafka 1.1) it is also possible to handle errors on the producer side, similar to the consumer part, by registering a ProductionExceptionHandler via the config default.production.exception.handler that can return CONTINUE.
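For example, a handler that skips records that are too large for the producer but fails on anything else might look roughly like this:

import java.util.Map;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.RecordTooLargeException;
import org.apache.kafka.streams.errors.ProductionExceptionHandler;

public class IgnoreRecordTooLargeHandler implements ProductionExceptionHandler {
    @Override
    public ProductionExceptionHandlerResponse handle(final ProducerRecord<byte[], byte[]> record,
                                                     final Exception exception) {
        if (exception instanceof RecordTooLargeException) {
            return ProductionExceptionHandlerResponse.CONTINUE;  // drop the oversized record
        }
        return ProductionExceptionHandlerResponse.FAIL;          // keep the default behaviour otherwise
    }

    @Override
    public void configure(final Map<String, ?> configs) { }
}

// register the handler in the Streams config
config.put(StreamsConfig.DEFAULT_PRODUCTION_EXCEPTION_HANDLER_CLASS_CONFIG,
           IgnoreRecordTooLargeHandler.class);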