I read that, "..The ordering operator has to buffer all elements it receives. Then, when it receives a watermark it can sort all elements that have a timestamp that is lower than the watermark and emit them in the sorted order. This is correct because the watermark signals that not more elements can arrive that would be intermixed with the sorted elements..." - https://cwiki.apache.org/confluence/display/FLINK/Time+and+Order+in+Streams
Hence, it seems that the watermark serves as a signal to the following operator, for beginning processing. I guess, that's what also a Trigger does. What's the difference between the two?
You can think of watermarks as special records that tell an operator what (event-) time it is. When an operator receives a watermark, it compares the watermark with its current time and other watermarks it received from different stream partitions. Depending on the comparison, the operator advances its own clock.
Some operators register timers (windows, time-based joins, custom implementations). An operator triggers a timer when the clock of the operator passes the time for which the timer was registered.
So, watermarks and timers are two different things. Watermarks tell an operator what time it is and the operator triggers a timer at the right point in time.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With