
Cleanup Policy: Compact/Delete and log.retention

Tags:

apache-kafka

I have a question about Kafka topic cleanup policies and how they interact with log.retention.

For example, if I set cleanup.policy to compact, will compaction only start after the topic's retention time has elapsed, or does the retention time have no effect on compaction?

Second part of the question: if I use compact,delete together and set log.retention to, let's say, 1 day, is the topic compacted all the time while its content is deleted after one day? Or do both compaction and deletion happen only after one day?

Thanks for your answers.

posthumecaver asked Apr 10 '19 11:04



1 Answer

Log segments can be deleted or compacted, or both, to manage their size. The topic-level configuration cleanup.policy determines the way the log segments for the topic are managed.

Log cleanup by compaction

If the topic-level configuration cleanup.policy is set to compact, the log for the topic is compacted periodically in the background by the log cleaner.

In a compacted topic, the log only needs to contain the most recent message for each key, while earlier messages for that key can be discarded.

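With cleanup.policy set to compact only, the retention time has no effect on compaction, so there is no need to set log.retention to -1 or any other value. Your topics will be compacted, and the most recent message for each key is never deleted by retention (as per the compaction rules).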

Note that only inactive segment files can be compacted; the active segment is never compacted.
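To make the compaction rule concrete, here is a minimal Python sketch of the idea. It is a simplified model, not Kafka's actual log cleaner: the `Record` layout and the `compact` function are hypothetical names made up for illustration, and it ignores tombstones, segment merging, and the thresholds that decide when the cleaner runs. It only shows that the latest record per key survives in the inactive segments and that the active segment is left alone.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout for illustration: (offset, key, value)
Record = Tuple[int, str, str]
Segment = List[Record]


def compact(segments: List[Segment]) -> List[Segment]:
    """Compact every segment except the last one, which models the active segment."""
    if not segments:
        return []
    *inactive, active = segments

    # Remember the highest offset seen for each key in the inactive segments.
    latest: Dict[str, int] = {}
    for seg in inactive:
        for offset, key, _value in seg:
            latest[key] = offset

    # Keep only the record holding the latest offset for its key.
    cleaned = [[rec for rec in seg if latest[rec[1]] == rec[0]] for seg in inactive]
    return cleaned + [active]


segments = [
    [(0, "k1", "a"), (1, "k2", "b"), (2, "k1", "c")],  # inactive
    [(3, "k2", "d"), (4, "k3", "e")],                  # inactive
    [(5, "k1", "f")],                                  # active, never compacted
]
result = compact(segments)
print(result)
```

In this model the older `k1` record at offset 2 stays in the first segment even though offset 5 is newer, because the active segment is not considered by the cleanup pass.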

Log cleanup by using both

You can specify both the delete and compact values for the cleanup.policy configuration at the same time (cleanup.policy=compact,delete). In this case, the log is still compacted in the background, and in addition the cleanup process deletes segments once they exceed the retention time or size limit.
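The combined behaviour can be sketched the same way. This is again a simplified model under stated assumptions, not Kafka's implementation: `compact_and_delete`, the record layout, and the millisecond timestamps are hypothetical, the size limit is left out, and the active segment is simply left untouched. It illustrates that retention can remove a record even when it is the latest one for its key.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout for illustration: (offset, key, value, timestamp_ms)
Record = Tuple[int, str, str, int]
Segment = List[Record]

DAY_MS = 86_400_000


def compact_and_delete(segments: List[Segment], now_ms: int, retention_ms: int) -> List[Segment]:
    """Model of compact,delete: compact inactive segments, then drop expired ones."""
    if not segments:
        return []
    *inactive, active = segments

    # Step 1: compaction keeps only the latest record per key (inactive segments only).
    latest: Dict[str, int] = {}
    for seg in inactive:
        for offset, key, _value, _ts in seg:
            latest[key] = offset
    compacted = [[rec for rec in seg if latest[rec[1]] == rec[0]] for seg in inactive]

    # Step 2: retention drops whole segments whose newest record is older than the cutoff.
    cutoff = now_ms - retention_ms
    kept = [seg for seg in compacted if seg and max(rec[3] for rec in seg) >= cutoff]
    return kept + [active]


now = 3 * DAY_MS
segments = [
    [(0, "k1", "a", 0), (1, "k2", "b", 1_000)],                              # older than one day
    [(2, "k1", "c", 2 * DAY_MS + 1_000), (3, "k1", "d", 2 * DAY_MS + 2_000)],  # within one day
    [(4, "k2", "e", now)],                                                   # active segment
]
result = compact_and_delete(segments, now_ms=now, retention_ms=DAY_MS)
print(result)
```

Here the record for `k2` at offset 1 survives compaction but is then removed by retention, because its segment is older than one day.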

I suggest going through the following links:

https://ibm.github.io/event-streams/installing/capacity-planning/

https://kafka.apache.org/documentation/#compaction

https://cwiki.apache.org/confluence/display/KAFKA/KIP-71%3A+Enable+log+compaction+and+deletion+to+co-exist

Rohit Yadav answered Oct 13 '22 17:10