Kafka Streams KTable store with change log topic vs log compacted source topic

I'm building a KTable from an input topic and joining it with a KStream, running on two Kafka Streams application instances.

The input topic for the KTable is already log compacted. So when one of my application instances goes down, the other instance's state store seems to be refreshed with the whole state by reading from the compacted input topic.
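Roughly, the topology looks like this (a minimal sketch; the topic names, String types, and value joiner are placeholders, not my real application code):

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class JoinTopology {

    public static Topology build() {
        StreamsBuilder builder = new StreamsBuilder();

        // KTable backed by the log-compacted input topic
        KTable<String, String> table = builder.table(
                "compacted-input-topic",
                Consumed.with(Serdes.String(), Serdes.String()));

        // Event stream that gets enriched by the table
        KStream<String, String> events = builder.stream(
                "event-topic",
                Consumed.with(Serdes.String(), Serdes.String()));

        // KStream-KTable join on the record key
        events.join(table, (eventValue, tableValue) -> eventValue + "|" + tableValue)
              .to("joined-output-topic", Produced.with(Serdes.String(), Serdes.String()));

        return builder.build();
    }
}
```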

So is there no need to enable logging (a changelog topic) for my KTable store?

My compacted source topic could have millions of records, so if I enable logging on that KTable state store, will it improve the state store refresh time in case of failure, or will it have no effect since the source topic is already log compacted? Thanks!

Balmani asked Nov 08 '22 at 15:11

1 Answer

So is there no need to enable logging (a changelog topic) for my KTable store?

That's correct. Kafka Streams will not create an additional changelog topic, but will use the input topic for recovery (no need to duplicate data).

so if I enable logging on that KTable state store

How would you do that?

will it improve the state store refresh time in case of failure, or will it have no effect since the source topic is already log compacted?

In general, you would not gain anything. As you correctly stated, the input topic is compacted anyway, so both topics would contain roughly the same data.

If you want to decrease fail-over time, you should configure StandbyTasks via the StreamsConfig parameter num.standby.replicas (the default is 0, so you could set it to 1). Cf. https://docs.confluent.io/current/streams/developer-guide.html#state-restoration-during-workload-rebalance
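For example, a minimal configuration sketch could look like this (the application id and bootstrap servers are placeholders):

```java
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class StandbyConfig {

    public static Properties buildConfig() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "ktable-join-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Keep one standby replica of each state store on another instance,
        // so a fail-over does not need to restore the full store from the topic first.
        props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);
        return props;
    }
}
```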

Matthias J. Sax answered Nov 15 '22 at 11:11