See size of Kafka Topics in Bytes

Tags: apache-kafka

For metrics, we need to see the total size of a Kafka topic in bytes across all partitions and brokers.

I have been searching for quite a while and still haven't worked out whether this is possible or, if it is, how to do it.

We are on version 0.8.2 of Kafka.

Nathan English asked Apr 18 '17 13:04




2 Answers

You can see the size of each partition with the kafka-log-dirs.sh script in the bin directory of your Kafka installation:

bin/kafka-log-dirs.sh --describe --bootstrap-server <KafkaBrokerHost>:<KafkaBrokerPort> --topic-list <YourTopic>
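
The kafka-log-dirs script ships only with newer Kafka releases (1.0 and later, as far as I know), so it is unlikely to exist on the older release mentioned in the question. On such a cluster, a rough alternative is to add up the segment files on each broker's disk. The following is a minimal sketch under stated assumptions, not an official tool: topic_bytes_on_disk is a hypothetical helper name, its first argument is assumed to be one of the broker's log.dirs directories, and it counts only the .log segment files inside directories named <topic>-<partition>. The demo builds a throwaway directory with fake segment files so the function can be tried without a broker.

```shell
# Hypothetical helper: total bytes of the .log segment files for one topic
# under a single broker log directory (one entry of the broker's log.dirs).
topic_bytes_on_disk() {
  log_dir=$1
  topic=$2
  for dir in "$log_dir/$topic"-*; do
    # Keep only <topic>-<digits>, so topic "events" does not also pick up
    # "events-2-0", which is partition 0 of a different topic "events-2".
    case ${dir##*/} in
      "$topic"-*[!0-9]*) continue ;;
    esac
    [ -d "$dir" ] || continue
    # wc -c prints "<bytes> <file>" for each segment file.
    find "$dir" -type f -name '*.log' -exec wc -c {} \;
  done | awk '{ total += $1 } END { print total + 0 }'
}

# Demo with made-up data (not a real Kafka log directory).
demo=$(mktemp -d)
mkdir -p "$demo/events-0" "$demo/events-1" "$demo/events-2-0"
printf 'aaaaa'    > "$demo/events-0/00000000000000000000.log"    # 5 bytes
printf 'xx'       > "$demo/events-0/00000000000000000000.index"  # index file, not counted
printf 'bbb'      > "$demo/events-1/00000000000000000000.log"    # 3 bytes
printf 'cccccccc' > "$demo/events-2-0/00000000000000000000.log"  # belongs to topic "events-2"
topic_bytes_on_disk "$demo" events   # prints 8
rm -rf "$demo"
```

This only covers one log directory on one broker. For a cluster-wide figure it would have to be run for every log directory on every broker and the results added together, which also means replicas are counted once per copy.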
Martbob answered Sep 19 '22 13:09


As Martbob very helpfully mentioned, you can do this using kafka-log-dirs. One of the lines it prints is JSON, so I use the ever-so-useful jq tool to pull out the 'size' fields (some are null), keep only the ones that are numbers, collect them into an array, and then add them together.

kafka-log-dirs \
    --bootstrap-server 127.0.0.1:9092 \
    --topic-list 'topic_of_interest' \
    --describe \
  | grep '^{' \
  | jq '[ ..|.size? | numbers ] | add'

Example output: 67704

I haven't verified whether the output makes sense, so you should check that yourself.
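
Where jq is not installed, roughly the same total can be obtained with grep, cut and awk, and running it on a sample with a known total is one way to check the arithmetic. This is a minimal sketch rather than a tested replacement: sum_sizes is a name made up here, the sample text is hand-written in the shape kafka-log-dirs is understood to print (it was not captured from a real cluster), and the pattern assumes compact JSON with no space after the colon in "size":123.

```shell
# Hypothetical helper: sum every numeric "size" field found on the JSON
# line of kafka-log-dirs output read from standard input.
# Assumes compact JSON, i.e. "size":123 with no whitespace after the colon.
sum_sizes() {
  grep '^{' \
    | grep -o '"size":[0-9][0-9]*' \
    | cut -d: -f2 \
    | awk '{ total += $1 } END { print total + 0 }'
}

# Hand-written sample: two brokers, three replicas of 1000, 2000 and 1000 bytes.
sample='Querying brokers for log directories information
Received log directory information from brokers 0,1
{"version":1,"brokers":[{"broker":0,"logDirs":[{"logDir":"/data/kafka","error":null,"partitions":[{"partition":"topic_of_interest-0","size":1000,"offsetLag":0,"isFuture":false},{"partition":"topic_of_interest-1","size":2000,"offsetLag":0,"isFuture":false}]}]},{"broker":1,"logDirs":[{"logDir":"/data/kafka","error":null,"partitions":[{"partition":"topic_of_interest-0","size":1000,"offsetLag":0,"isFuture":false}]}]}]}'

printf '%s\n' "$sample" | sum_sizes   # prints 4000
```

Against a real cluster, the kafka-log-dirs command above would be piped into sum_sizes in place of the grep and jq stages. Either way, the total counts every replica on every broker, so a topic with replication factor 3 reports roughly three times the size of a single copy of the data.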

Cameron Kerr answered Sep 17 '22 13:09