
Kafka: how to avoid running out of disk storage

Tags:

apache-kafka

I want to describe the following case, which happened on one of our production clusters.

We have an Ambari cluster with HDP version 2.6.4.

The cluster includes 3 Kafka machines, and each Kafka machine has a 5T disk.

What we saw is that all the Kafka disks were at 100% usage. The Kafka disks were full, and this is the reason all the Kafka brokers failed.

df -h /kafka
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb         5T   5T   23M   100% /var/kafka

After investigating, we saw that log.retention.hours is set to 7 days (168 hours).

So it seems that data is purged only after 7 days, and maybe this is why the Kafka disks are 100% full even though they are huge (5T).

We now want to know how to avoid this in the future, that is, how to keep the Kafka disks from filling up.

What do we need to set in the Kafka configuration so that data is purged according to disk size? Is that possible?

And how do we choose the right value for log.retention.hours? Based on the disk size, or something else?

Judy asked Oct 24 '18 13:10



1 Answer

In Kafka, there are two types of log retention: size-based and time-based. The former is controlled by log.retention.bytes and the latter by log.retention.hours.
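To make the two retention types concrete, here is a simplified Python model of a retention check on a single partition. It is only a sketch of the behaviour as I understand it, not Kafka's actual implementation. The Segment class, its fields and the segments_to_delete function are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """Hypothetical stand-in for one closed log segment of a partition."""
    size_bytes: int
    newest_record_ms: int  # timestamp of the newest record in the segment


def segments_to_delete(
    closed_segments: List[Segment],
    active_size_bytes: int,
    now_ms: int,
    retention_ms: Optional[int] = None,
    retention_bytes: Optional[int] = None,
) -> List[Segment]:
    """Return the closed segments (oldest first) that a check would remove.

    Assumptions of this model: segments are ordered oldest first, the
    active segment is never deleted, and None means "no limit".
    """
    remaining = list(closed_segments)
    deleted = []

    # Time retention: drop the oldest segments whose newest record has expired.
    if retention_ms is not None:
        while remaining and now_ms - remaining[0].newest_record_ms > retention_ms:
            deleted.append(remaining.pop(0))

    # Size retention: drop the oldest segment only if the partition would
    # still hold at least retention_bytes afterwards.
    if retention_bytes is not None:
        total = sum(s.size_bytes for s in remaining) + active_size_bytes
        while remaining and total - remaining[0].size_bytes >= retention_bytes:
            total -= remaining[0].size_bytes
            deleted.append(remaining.pop(0))

    return deleted
```

In this model a segment is removed as soon as either limit allows it, which is why setting both a size and a time limit gives the tightest bound on disk usage.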

In your case, you should pay attention to size retention, which can sometimes be quite tricky to configure. Assuming that you want a delete cleanup policy, you need to set the following parameters:

log.cleaner.enable=true
log.cleanup.policy=delete

Then you need to think about the configuration of log.retention.bytes, log.segment.bytes and log.retention.check.interval.ms. To do so, you have to take into consideration the following factors:

  • log.retention.bytes is a minimum guarantee for a single partition of a topic: if you set it to 512MB, then once a partition has grown past that size, at least 512MB of its data will always remain on disk.

  • With log.retention.bytes set to 512MB and log.retention.check.interval.ms set to 5 minutes (the default), a partition holds, at any given time, at least 512MB of data plus whatever was produced within the 5-minute window before the retention check runs.

  • A topic partition's log on disk is made up of segments, and the segment size is set by log.segment.bytes. With log.retention.bytes=1GB and log.segment.bytes=512MB, you will have up to 3 segments on disk: 2 full segments that together reach the retention size, plus the active segment that is currently being written to.
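The three factors above can be combined into a rough worst-case estimate. The following Python sketch is an approximation under stated assumptions, not an exact formula from Kafka: it adds one extra segment and one check interval's worth of produced data on top of log.retention.bytes. The function names and the per-partition ingest rate are hypothetical inputs you would have to measure yourself.

```python
MIB = 1024 * 1024


def worst_case_partition_bytes(
    retention_bytes: int,
    segment_bytes: int,
    ingest_bytes_per_sec: int,
    check_interval_ms: int = 300_000,  # 5 minutes
) -> int:
    """Approximate upper bound on the disk used by one partition replica."""
    # Data that can arrive between two retention checks.
    produced_between_checks = ingest_bytes_per_sec * check_interval_ms // 1000
    # Retained data + the active segment + data written since the last check.
    return retention_bytes + segment_bytes + produced_between_checks


def worst_case_broker_bytes(partition_replicas_on_broker: int, **per_partition) -> int:
    """Same bound, multiplied by the partition replicas hosted on one broker."""
    return partition_replicas_on_broker * worst_case_partition_bytes(**per_partition)
```

For example, with log.retention.bytes=1GB, log.segment.bytes=512MB and an assumed 1MB/s per partition, worst_case_partition_bytes(1024 * MIB, 512 * MIB, 1 * MIB) gives 1836MB per partition replica. Multiply by the number of partition replicas on the broker and compare the result with the disk size.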

Finally, do the math: compute the maximum size that Kafka logs might occupy on your disk at any given time, and tune the parameters above accordingly. I would also advise setting a time retention policy by configuring log.retention.hours. For example, if you no longer need your data after 2 days, set log.retention.hours=48.
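As a starting point for that math, here is a minimal Python sketch that works backwards from the disk size. Everything in it is an assumption to adapt: the function names are hypothetical, the 20% headroom is an arbitrary safety margin, traffic is assumed to be steady and evenly spread over partitions, and the write rate must include replica traffic landing on the broker.

```python
GIB = 1024 ** 3


def usable_bytes(disk_bytes: int, headroom_pct: int = 20) -> int:
    """Bytes Kafka logs may occupy while keeping headroom_pct percent free."""
    return disk_bytes * (100 - headroom_pct) // 100


def retention_bytes_for_disk(
    disk_bytes: int,
    partition_replicas: int,
    segment_bytes: int = GIB,
    headroom_pct: int = 20,
) -> int:
    """Candidate log.retention.bytes (a per-partition value) for one broker."""
    per_partition = usable_bytes(disk_bytes, headroom_pct) // partition_replicas
    # Leave room for the active segment on top of the retained data.
    value = per_partition - segment_bytes
    if value <= 0:
        raise ValueError("disk budget too small for this partition count and segment size")
    return value


def retention_hours_for_disk(
    disk_bytes: int,
    broker_write_bytes_per_sec: int,
    headroom_pct: int = 20,
) -> int:
    """Largest whole number of hours of data that fits in the usable space."""
    return usable_bytes(disk_bytes, headroom_pct) // (broker_write_bytes_per_sec * 3600)
```

Treating the 5T disk as 5 * 10**12 bytes and assuming a made-up 1000 partition replicas per broker, retention_bytes_for_disk(5 * 10**12, 1000) returns 2926258176, roughly 2.7GB per partition. With a made-up write rate of 10MB/s per broker, retention_hours_for_disk(5 * 10**12, 10 * 10**6) returns 111, so a 7-day (168-hour) retention would not fit on that disk.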

Giorgos Myrianthous answered Oct 02 '22 19:10