
Java: How to get the number of messages in a topic in Apache Kafka

I am using Apache Kafka for messaging. I have implemented the producer and consumer in Java. How can I get the number of messages in a topic?

Chetan asked Feb 18 '15 09:02



2 Answers

This is not Java, but it may be useful:

./bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
  --broker-list <broker>:<port> \
  --topic <topic-name> \
  | awk -F ":" '{sum += $3} END {print sum}'
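Since the question asks about Java, the awk step can be sketched in Java as well. This is only a rough equivalent, assuming each output line of GetOffsetShell has the form <topic>:<partition>:<offset>; the class and method names (OffsetSum, sumOffsets) are made up for illustration and are not part of Kafka.

```java
import java.util.List;

// Hypothetical helper (not part of Kafka): mirrors the awk step above by
// summing the third colon-separated field of each GetOffsetShell output line.
final class OffsetSum {

    // Assumes each line looks like "<topic>:<partition>:<offset>".
    static long sumOffsets(List<String> lines) {
        long sum = 0;
        for (String line : lines) {
            String[] fields = line.trim().split(":");
            if (fields.length < 3) {
                continue; // skip blank or unexpected lines
            }
            sum += Long.parseLong(fields[2]);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Sample lines invented for illustration.
        List<String> sample = List.of(
                "my-topic:0:120",
                "my-topic:1:95",
                "my-topic:2:101");
        System.out.println(sumOffsets(sample)); // prints 316
    }
}
```

Note that this sums the latest offset of every partition. If retention has already deleted older messages, the result is higher than the number of messages currently stored.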
ssemichev answered Sep 23 '22 06:09


The only way that comes to mind from a consumer's point of view is to actually consume the messages and count them.

The Kafka broker exposes JMX counters for the number of messages received since start-up, but they cannot tell you how many of those messages have already been purged.

In most common scenarios, messages in Kafka are best seen as an infinite stream, and a discrete count of how many are currently kept on disk is not relevant. Furthermore, things get more complicated with a cluster of brokers, each of which holds only a subset of the messages in a topic.
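If an approximate count of what is currently retained is still wanted, one common approach is to subtract each partition's earliest offset from its latest offset and add up the differences across partitions. The sketch below shows only that arithmetic. It assumes the offsets were already fetched, for example with GetOffsetShell using --time -2 (earliest) and --time -1 (latest), or with the Java client's KafkaConsumer.beginningOffsets and endOffsets. The class name, method name, and sample values are invented, and plain partition numbers are used as map keys so the example runs without a broker.

```java
import java.util.Map;

// Hypothetical helper (not part of Kafka): estimates how many messages are
// currently retained in a topic from per-partition offsets.
final class TopicMessageCount {

    // earliest: partition -> first offset still available
    // latest:   partition -> offset the next produced message would get
    static long countRetained(Map<Integer, Long> earliest, Map<Integer, Long> latest) {
        long total = 0;
        for (Map.Entry<Integer, Long> entry : latest.entrySet()) {
            // Assumption: a partition missing from 'earliest' starts at offset 0.
            long start = earliest.getOrDefault(entry.getKey(), 0L);
            total += entry.getValue() - start;
        }
        return total;
    }

    public static void main(String[] args) {
        // Offsets invented for illustration.
        Map<Integer, Long> earliest = Map.of(0, 20L, 1, 0L, 2, 51L);
        Map<Integer, Long> latest = Map.of(0, 120L, 1, 95L, 2, 101L);
        System.out.println(countRetained(earliest, latest)); // prints 245
    }
}
```

On compacted topics, or topics written with transactions, offsets have gaps, so the result is better read as an upper bound than as an exact count.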

Lundahl answered Sep 20 '22 06:09