I'm not sure whether this has already been answered. I couldn't find a proper explanation, so I'm posting my question here.
Why is the Kafka Streams state.dir stored under /tmp/kafka-streams?
I know I can change the path by setting the state dir config in the streams code, like below:
props.put(StreamsConfig.STATE_DIR_CONFIG, "/var/abc-Streams");
But will changing the directory have any impact?
Or, put differently:
Can I configure the state DB in an application directory instead of /tmp?
As per the Confluent documentation on stateful operations, Kafka Streams
"automatically creates and manages such state stores when you are calling stateful operators such as count() or aggregate(), or when you are windowing a stream"
but it doesn't specify where exactly the state is stored.
Any thoughts?
Why is the Kafka Streams state.dir stored under /tmp/kafka-streams?
There are several reasons.
The /tmp directory is writable by default, so as a beginner you don't have to struggle with write permissions.
The /tmp directory is short-lived. It is typically cleaned on each system reboot, so the disk doesn't fill up if you forget to delete the state.dir.
The downside is that you lose the state from previous runs, so it has to be rebuilt from scratch. If you want to reuse the state stored in state.dir, store it somewhere other than /tmp.
All state stores are kept under the location specified in state.dir, in a subdirectory named after the application id. If state.dir is not specified, that is the /tmp/kafka-streams/<app-id> directory.
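To make the layout concrete, here is a minimal sketch of how the store directory follows from the two settings, plus a quick writability check you might run before moving state out of /tmp. It uses only the JDK: the string keys "state.dir" and "application.id" are the values behind StreamsConfig.STATE_DIR_CONFIG and StreamsConfig.APPLICATION_ID_CONFIG, while the class name, the method names, and the "abc-app" id are made up for illustration. The fallback path is the default quoted above, hard-coded here as an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

public class StateDirSketch {

    // Builds Streams-style properties with an explicit state directory.
    // "state.dir" is the key behind StreamsConfig.STATE_DIR_CONFIG.
    public static Properties buildProps(String applicationId, String stateDir) {
        Properties props = new Properties();
        props.put("application.id", applicationId);
        props.put("state.dir", stateDir);
        return props;
    }

    // Resolves <state.dir>/<application.id>, falling back to the default
    // mentioned in the answer (assumed here) when state.dir is not set.
    public static Path storeDir(Properties props) {
        String stateDir = props.getProperty("state.dir", "/tmp/kafka-streams");
        return Paths.get(stateDir, props.getProperty("application.id"));
    }

    // Creates the directory if it is missing and reports whether the
    // current user can write to it.
    public static boolean ensureWritable(Path dir) throws IOException {
        Files.createDirectories(dir);
        return Files.isWritable(dir);
    }

    public static void main(String[] args) throws IOException {
        Properties props = buildProps("abc-app", "/var/abc-Streams");
        System.out.println("Stores would live under: " + storeDir(props));

        // Try the writability check against a throwaway directory rather
        // than /var, which usually needs elevated permissions.
        Path scratch = Files.createTempDirectory("state-dir-check");
        System.out.println("Writable: " + ensureWritable(scratch.resolve("abc-app")));
    }
}
```

In a real application the same properties object would be passed to KafkaStreams along with the topology. The main thing to get right when changing the directory is that the user running the application can create and write to it.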