I'm trying to understand the logic behind the retention period in Apache Kafka. Please help me understand the behavior in the scenarios below.
java.lang.IllegalArgumentException: requirement failed: log.retention.ms must be unlimited (-1) or, equal or greater than 1
You can still set it to zero while using the parameters log.retention.minutes or log.retention.ms
Now, let's come to the point of data deletion. In this situation, the old data likely won't be deleted even after the configured retention (say 1 hr, or 1 min) has expired, because one more variable in server.properties, log.segment.bytes, plays a major role here. The value of log.segment.bytes is 1 GB by default. Kafka only performs deletion on a closed segment. So a log segment is closed only once it has reached 1 GB, and only after that does retention kick in. You therefore need to reduce log.segment.bytes to a value that is at most the cumulative ingestion volume of the data you plan to retain for that short duration. E.g. if your retention period is 10 min and you get roughly 1 MB of data per minute, then you can set log.segment.bytes=10485760, which is 1024 x 1024 x 10. You can find an example of how retention depends on both data ingestion and time in this thread.
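The sizing arithmetic above can be sketched as a small shell helper. The function name segment_bytes and its two arguments are made up for illustration, and the ingestion rate is only your own estimate, so treat the result as a rough upper bound rather than an exact setting.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kafka): rough upper bound for
# log.segment.bytes / segment.bytes, given a retention window in minutes
# and an estimated ingestion rate in MB per minute.
segment_bytes() {
  retention_minutes=$1
  mb_per_minute=$2
  # 1 MB = 1024 * 1024 bytes
  echo $(( retention_minutes * mb_per_minute * 1024 * 1024 ))
}

# The example from the text: 10 minutes at roughly 1 MB per minute.
segment_bytes 10 1   # prints 10485760
```

Anything at or below the printed value should let a segment roll within the retention window, assuming the ingestion estimate holds.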
To test this, we can try a small experiment. Let's start Zookeeper and Kafka, create a topic called test, and change its retention period to zero.
1) nohup ./zookeeper-server-start.sh ../config/zookeeper.properties &
2) nohup ./kafka-server-start.sh ../config/server.properties &
3) ./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
4) ./kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name test --alter --add-config retention.ms=0
Now if we insert enough records using kafka-console-producer, we'll see that the records are still not deleted even after 2-3 minutes. But now, let's change segment.bytes (the topic-level counterpart of log.segment.bytes) to 100 bytes.
5) ./kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name test --alter --add-config segment.bytes=100
Now, almost immediately we'll see that old records are getting deleted from Kafka.
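One way to watch this happen is to count the segment files in the partition directory before and after the change. This is only a sketch: count_segments is a made-up helper, and the path /tmp/kafka-logs/test-0 assumes log.dirs=/tmp/kafka-logs and partition 0 of the topic test, which may not match your setup.

```shell
#!/bin/sh
# Hypothetical helper: count the segment files (*.log) directly inside
# a partition directory.
count_segments() {
  dir=$1
  # find prints one line per matching file; wc -l counts those lines
  find "$dir" -maxdepth 1 -type f -name '*.log' | wc -l | tr -d ' '
}

# Assumed location: log.dirs=/tmp/kafka-logs, topic "test", partition 0.
# Override PARTITION_DIR if your broker writes its logs elsewhere.
PARTITION_DIR=${PARTITION_DIR:-/tmp/kafka-logs/test-0}
if [ -d "$PARTITION_DIR" ]; then
  echo "segments in $PARTITION_DIR: $(count_segments "$PARTITION_DIR")"
fi
```

Running it a few times while producing records should show segments appearing as they roll and then disappearing as the closed ones are cleaned up.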
In server.properties, if we delete or comment out a property, the default value for that property kicks in. I think the default retention period is 1 week.
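To check whether a property is actually set or has fallen back to its default, something like the following could be used. The helper active_value is made up for illustration; it only looks at uncommented key=value lines in the file and knows nothing about Kafka's built-in defaults.

```shell
#!/bin/sh
# Hypothetical helper: print the active (uncommented) value of a property
# in a properties file, or a note that the broker default applies.
active_value() {
  file=$1
  key=$2
  # Split on '=' and compare the key exactly; commented lines have a key
  # starting with '#', so they never match. The last match wins.
  value=$(awk -F= -v k="$key" '$1 == k { v = substr($0, length(k) + 2) } END { print v }' "$file")
  if [ -n "$value" ]; then
    echo "$value"
  else
    echo "unset (broker default applies)"
  fi
}

# Assumed path, relative to Kafka's bin directory as in the commands above.
CONFIG=${CONFIG:-../config/server.properties}
if [ -f "$CONFIG" ]; then
  active_value "$CONFIG" log.retention.hours
fi
```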