Say I have a stream of employees, keyed by empId, whose values also include a departmentId.

I want to aggregate by department, so I do a selectKey() with a mapper that extracts the departmentId, then groupByKey() (or I could just do a groupBy(...), I assume), and then, say, count(). What exactly happens? I gather that it does a "repartition". I think what happens is that it writes to an "internal" topic, which is just a regular topic with a derived name, created automatically. That is, the topic is shared by all instances of the streams application, not just one (i.e. it is not local). So the aggregation covers all records with the new key, not just the messages seen by the local source stream instance (I think). Is that correct?
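To illustrate the semantics I'm asking about with a plain-Java mock (no Kafka involved; Employee is a hypothetical record type, and the two lists stand in for two source partitions):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RepartitionSemantics {
    // Hypothetical employee record: keyed by empId, carries departmentId
    record Employee(String empId, String departmentId) {}

    // Simulates selectKey(departmentId) + groupByKey() + count(): the
    // repartition brings together records with the same *new* key from
    // all source partitions before the aggregation runs.
    static Map<String, Long> countByDepartment(List<Employee> p0, List<Employee> p1) {
        return Stream.concat(p0.stream(), p1.stream())
                .collect(Collectors.groupingBy(Employee::departmentId,
                                               Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Employee> partition0 = List.of(
                new Employee("e1", "sales"), new Employee("e3", "eng"));
        List<Employee> partition1 = List.of(
                new Employee("e2", "sales"), new Employee("e4", "eng"),
                new Employee("e5", "sales"));
        // "sales" records live in both source partitions, yet yield one count
        System.out.println(countByDepartment(partition0, partition1));
    }
}
```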
I've not found a comprehensive description of repartitioning. Can anybody point me to a good article on this?
Partitioning takes the single topic log and breaks it into multiple logs, each of which can live on a separate node in the Kafka cluster. This way, the work of storing messages, writing new messages, and processing existing messages can be split among many nodes in the cluster.
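A record's partition is chosen by hashing its key modulo the partition count, so all records with the same key land in the same log. A minimal sketch of that idea (String.hashCode is used here purely for illustration; Kafka's real default partitioner hashes key bytes with murmur2):

```java
public class PartitionerSketch {
    // Simplified default-partitioner logic: same key -> same partition.
    // (Kafka's actual producer uses murmur2 over the serialized key bytes;
    // String.hashCode here is just an illustrative stand-in.)
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int p = partitionFor("sales", 4);
        // every record keyed "sales" maps to the same partition
        System.out.println("\"sales\" -> partition " + p);
    }
}
```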
Finally, a sink is a node in the graph that receives records from upstream nodes and writes them to a Kafka topic. A Topology allows you to construct an acyclic graph of these nodes, which is then passed into a new KafkaStreams instance that begins consuming, processing, and producing records.
In the Kafka Streams DSL, an input stream of an aggregation operation can be a KStream or a KTable, but the output stream will always be a KTable. This allows Kafka Streams to update an aggregate value upon the out-of-order arrival of further records after the value was produced and emitted.
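That KTable update behavior can be mimicked in plain Java: each incoming record emits a refreshed aggregate for its key rather than contributing to one final result, so a late-arriving record simply produces a further update (a sketch with hypothetical types, no Kafka involved):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ChangelogSketch {
    // Simulates count() as a KTable changelog: every incoming record for a
    // key emits an updated (key, count) pair, so an out-of-order record
    // refines the aggregate instead of being dropped.
    static List<Map.Entry<String, Long>> countUpdates(List<String> keys) {
        Map<String, Long> state = new HashMap<>();
        List<Map.Entry<String, Long>> updates = new ArrayList<>();
        for (String key : keys) {
            long count = state.merge(key, 1L, Long::sum);
            updates.add(Map.entry(key, count));
        }
        return updates;
    }

    public static void main(String[] args) {
        // A "late" second record for "sales" just emits a further update
        System.out.println(countUpdates(List.of("sales", "eng", "sales")));
        // -> [sales=1, eng=1, sales=2]
    }
}
```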
What you describe is exactly what is happening.
A repartition step is the same as a through() call (auto-inserted into the processing topology), which is a shortcut for to("topic") plus builder.stream("topic").
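In code, that equivalence looks roughly like this (an API sketch against the older DSL, not a runnable program: "repartition-topic" is a placeholder name, and note that newer Kafka Streams versions deprecate through() in favor of repartition()):

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;

public class ThroughEquivalence {
    static void sketch(StreamsBuilder builder, KStream<String, String> stream) {
        // The auto-inserted repartition step behaves like an explicit through():
        KStream<String, String> viaThrough = stream.through("repartition-topic");

        // ...which is itself shorthand for writing out and reading back:
        stream.to("repartition-topic");
        KStream<String, String> viaToAndStream = builder.stream("repartition-topic");
    }
}
```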
It's also illustrated and explained in this blog post: https://www.confluent.io/blog/data-reprocessing-with-kafka-streams-resetting-a-streams-application/