I'm not sure whether this has already been answered. I couldn't find a proper explanation, so I'm posting my question here.
Why is the Kafka Streams state.dir stored under /tmp/kafka-streams?
I know I can change the path by setting the state directory config in the Streams code, like below:
StreamsConfig.STATE_DIR_CONFIG, "/var/abc-Streams"
But will changing the directory have any impact?
Or can I configure the state DB in an application directory rather than in /tmp?
As per the Confluent documentation, for stateful operations Kafka Streams "automatically creates and manages such state stores when you are calling stateful operators such as count() or aggregate(), or when you are windowing a stream", but the documentation doesn't specify where exactly that state is stored.
Any thoughts?
Why is the Kafka Streams state.dir stored under /tmp/kafka-streams?
There are a couple of reasons.
First, the /tmp directory is writable by default, so as a beginner you don't have to struggle with write permissions.
Second, the /tmp directory is short-lived. On many systems it is cleaned on reboot, so you don't end up with a flooded disk if you forget to delete the state.dir. The downside is that you lose the state from the previous run, so it has to be rebuilt from scratch. If you want to reuse the state stored in state.dir, you should store it somewhere other than /tmp.
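As a rough sketch of moving the state directory out of /tmp, the snippet below prepares a persistent directory and builds the properties you would later pass to your Streams application. It uses only the JDK: the string keys "application.id", "bootstrap.servers" and "state.dir" are the literal names behind the corresponding StreamsConfig constants, while the class name, the buildProps helper, the application id and the broker address are hypothetical and only there for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

public class StateDirConfigSketch {

    // Builds Streams properties with state.dir pointing at the given directory.
    // Fails early if the directory cannot be created or is not writable,
    // which is the permission problem that the /tmp default sidesteps.
    static Properties buildProps(String appId, String bootstrapServers, Path stateDir)
            throws IOException {
        Files.createDirectories(stateDir); // does nothing if the directory already exists
        if (!Files.isWritable(stateDir)) {
            throw new IOException("State directory is not writable: " + stateDir);
        }
        Properties props = new Properties();
        props.put("application.id", appId);               // StreamsConfig.APPLICATION_ID_CONFIG
        props.put("bootstrap.servers", bootstrapServers); // StreamsConfig.BOOTSTRAP_SERVERS_CONFIG
        props.put("state.dir", stateDir.toAbsolutePath().toString()); // StreamsConfig.STATE_DIR_CONFIG
        return props;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: StateDirConfigSketch <state-dir>, e.g. /var/abc-Streams");
            return;
        }
        // "my-app" and "localhost:9092" are placeholder values.
        Properties props = buildProps("my-app", "localhost:9092", Paths.get(args[0]));
        System.out.println("state.dir = " + props.getProperty("state.dir"));
    }
}
```

The user running the application needs write access to the chosen directory, which is why the sketch checks this before any stream processing starts.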
All state stores are kept under the location specified in state.dir, in a subdirectory named after the application ID. If state.dir is not specified, that is the /tmp/kafka-streams/<app-id> directory.
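To make the layout concrete, here is a small sketch that works out where an application's state would end up for a given configuration. It is not Kafka Streams code: the class and the appStateDir helper are hypothetical, and the /tmp/kafka-streams fallback is simply the default described in this answer, so treat it as an assumption for your version.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

public class StateDirLocationSketch {

    // Fallback used by this sketch when state.dir is not set (assumed default).
    static final String DEFAULT_STATE_DIR = "/tmp/kafka-streams";

    // Returns <state.dir>/<application.id>, the directory that would hold
    // this application's state stores.
    static Path appStateDir(Properties props) {
        String base = props.getProperty("state.dir", DEFAULT_STATE_DIR);
        String appId = props.getProperty("application.id");
        if (appId == null) {
            throw new IllegalArgumentException("application.id is required");
        }
        return Paths.get(base).resolve(appId);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("application.id", "my-app"); // placeholder application id
        System.out.println("default:    " + appStateDir(props));

        props.setProperty("state.dir", "/var/abc-Streams"); // path from the question
        System.out.println("overridden: " + appStateDir(props));
    }
}
```

Checking the resolved directory on disk after a run is a quick way to confirm which location your application is actually using.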