Does Kafka Streams internally compute watermarks? Is it possible to observe the result of a window (only) when it completes (i.e. when watermark passes end of window)?
Back in May 2017, we laid out why we believe that Kafka Streams is better off without a concept of watermarks or triggers, and instead opts for a continuous refinement model. This article explains how we are fundamentally sticking with this model, while also opening the door for use cases that are incompatible with continuous refinement.
Kafka Streams supports stateless and stateful operations, but Kaka Consumer only supports stateless operations. Kafka Consumer offers you the capability to write in several Kafka Clusters, whereas Kafka Streams lets you interact with a single Kafka Cluster only. Here are the steps you can follow to connect Kafka Streams to Confluent Cloud:
Kafka Streams provides two abstractions for Streams and Tables. KStream handles the stream of records. On the other hand, KTable manages the changelog stream with the latest state of a given key. Each data record represents an update. There is another abstraction for not partitioned tables.
Your event stream data comes in from Kafka through the source nodes at the top of the topology, flows through the user processor nodes where custom-logic operations are performed, and exits through the sink nodes to a new Kafka topic.
Kafka Streams does not use watermarks internally, but a new feature in 2.1.0 lets you observe the result of a window when it closed. It's called Suppressed
, and you can read about it in the docs: Window Final Results:
KGroupedStream<UserId, Event> grouped = ...;
grouped
.windowedBy(TimeWindows.of(Duration.ofHours(1)).grace(ofMinutes(10)))
.count()
.suppress(Suppressed.untilWindowCloses(unbounded()))
Does Kafka Streams internally compute watermarks?
No. Kafka Streams follows a continuous update processing model that does not require watermarks. You can find more details about this online:
Is it possible to observe the result of a window (only) when it completes (i.e. when watermark passes end of window)?
You can observe of result of a window at any point in time. Either via subscribing to the result changelog stream via for example KTable#toStream()#foreach()
(ie, a push based approach), or via Interactive Queries that let you query the result window actively (ie, a pull based approach).
As mentioned by @Dmitry, for the push based approach, you can also use suppress()
operator if you are only interested in a window's final result.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With