I'm experimenting with Kafka Streams and I have the following setup:
Is there any way to make my KTable "inherit" the retention policy from my topic, so that when records are aged out of the primary topic they're no longer available in the KTable?
I'm worried about dumping all of the records into the KTable and having the StateStore grow unbounded.
One solution I can think of is to transform the input into a windowed stream with hopping windows equal to a time-to-live for the records, but I'm wondering if there's a better, more native solution.
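Roughly, that workaround would look something like the sketch below (topic name, serdes, and TTL are placeholders for my actual setup; the windowing API shown is the one in recent Kafka Streams versions):

```java
import java.time.Duration;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.kstream.Windowed;

public class WindowedTtlSketch {
    public static void main(String[] args) {
        Duration ttl = Duration.ofDays(7); // placeholder time-to-live per record

        StreamsBuilder builder = new StreamsBuilder();
        // Aggregate the input stream into hopping windows sized to the TTL;
        // the windowed state store ages out old windows instead of growing forever.
        KTable<Windowed<String>, String> windowedTable = builder
            .stream("my-input-topic", Consumed.with(Serdes.String(), Serdes.String()))
            .groupByKey()
            .windowedBy(TimeWindows.ofSizeWithNoGrace(ttl).advanceBy(ttl.dividedBy(2)))
            .reduce((oldValue, newValue) -> newValue); // keep the latest value per key per window
    }
}
```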
Thanks.
Internally, a KTable is implemented using RocksDB and a topic in Kafka. RocksDB stores the current data of the table (note that RocksDB is not an in-memory store and can write to disk). At the same time, each update to the KTable (i.e., to RocksDB) is also written into the corresponding Kafka topic.
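For illustration, here is a minimal sketch of that pairing (topic name, store name, and serdes are assumed). Materializing the table creates a local RocksDB store, and Kafka Streams backs it with an internal changelog topic named `<application.id>-my-table-store-changelog`:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

public class MaterializedTableSketch {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        // The table's current state lives in a local RocksDB store ("my-table-store");
        // every update is also written to the store's internal changelog topic.
        KTable<String, String> table = builder.table(
            "my-input-topic",
            Materialized.<String, String, KeyValueStore<Bytes, byte[]>>as("my-table-store")
                .withKeySerde(Serdes.String())
                .withValueSerde(Serdes.String()));
    }
}
```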
Time Based Retention
Under this policy, we configure the maximum time a segment (and hence its messages) can live. Once a segment has exceeded the configured retention time, it is marked for deletion or compaction, depending on the configured cleanup policy. The default retention time for segments is 7 days.
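As a concrete example, time-based retention is set per topic via `retention.ms`. A sketch using the AdminClient (broker address and topic name are assumptions):

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;
import org.apache.kafka.common.config.TopicConfig;

public class RetentionConfigSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker

        try (Admin admin = Admin.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-input-topic");
            // Set time-based retention on the input topic to 7 days (the default).
            AlterConfigOp setRetention = new AlterConfigOp(
                new ConfigEntry(TopicConfig.RETENTION_MS_CONFIG, "604800000"),
                AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(Map.of(topic, List.of(setRetention))).all().get();
        }
    }
}
```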
A KTable is fully stored in RocksDB (a local, disk-backed store). When the KTable receives a record with a null value, it deletes that key from RocksDB (and the space is freed up).
It is unfortunately not supported at the moment. There is a JIRA for it, though: https://issues.apache.org/jira/browse/KAFKA-4212
Another possibility would be to insert tombstone messages (`<key, null>`) into the input topic. The KTable would pick those up and delete the corresponding key from the store.
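For example, a minimal producer sketch that writes a tombstone for a key (topic name, key, and serializers are assumptions):

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TombstoneSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // A <key, null> record is a tombstone: the KTable reading this topic
            // will delete the corresponding key from its state store.
            producer.send(new ProducerRecord<>("my-input-topic", "some-expired-key", null));
        }
    }
}
```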