Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to Handle Application Errors in Flink

I am currently wondering how to handle application errors in Apache Flink streaming applications. In general, I see two cases:

  1. Transient errors, where you want the input data to be replayed and processing might succeed on second try. An example would be a dependency on an external service, which is temporarily unavailable.
  2. Permanent errors, where repeated processing will still fail; for example invalid input data.

For the first case, it looks like the common solution is to just throw some exception. Or is there a better way, e.g. a special kind of exception for more efficient handling such as FailedException from Apache Storm Trident (see Error handling in Storm Trident topologies).

For permanent errors, I couldn't find any information online. A map() operation, for example, always has to return something so one cannot just silently drop messages as you would in Trident.

What are the available APIs or best practices? Thanks for your help.

like image 742
F30 Avatar asked Apr 28 '16 10:04

F30


1 Answers

Since this question was asked, there has been some development:

This discussion holds the background of why side outputs should help, key extract:

Side outputs(a.k.a Multi-outputs) is one of highly requested features in high fidelity stream processing use cases. With this feature, Flink can

  • Side output corrupted input data and avoid job fall into “fail -> restart -> fail” cycle
  • Side output sparsely received late arriving events while issuing aggressive watermarks in window computation.

This resulted in jira: FLINK-4460 which has been resolved in Flink 1.1.3 and above.


I hope this helps, if an even more generic solution would be desireable, please think a bit on your usecase and consider to create a jira for it.

like image 185
Dennis Jaheruddin Avatar answered Oct 07 '22 00:10

Dennis Jaheruddin