I'm super new to Kafka. I've installed kafka and zookeeper using homebrew on my mac, and I'm playing around with the quickstart guide.
I've been able to push messages onto Kafka using the following command and STDIN
kafka-console-producer --broker-list localhost:9092 --topic test
and I can read things off using
kafka-console-consumer --bootstrap-server localhost:9092 --topic test --from-beginning
What's not clear to me is how I use offsets. It's my understanding that each message added to a topic will have a numerical, incremental offset value. However, if I try to do something like this
kafka-console-consumer --bootstrap-server localhost:9092 --topic test --offset 1
I get a non-zero status code and no messages (only the usual help/usage information).
I also can't get the latest or earliest keywords to work
kafka-console-consumer --bootstrap-server localhost:9092 --topic test --offset earliest
kafka-console-consumer --bootstrap-server localhost:9092 --topic test --offset latest
Both of the above also return non-zero status codes.
Do I fundamentally misunderstand offsets? If not, is there a way to list all messages with their offsets? Finally, what's the simplest example of the --offset flag for the kafka-console-consumer?
Short answer: if your Kafka topic is in Confluent Cloud, use the kafka-console-consumer command with the --partition and --offset flags to read from a specific partition and offset. You can also read messages from a specified partition and offset using the Confluent Cloud Console.
How to find the current consumer offset? Use the kafka-consumer-groups tool with the --describe option and the consumer group id (--group). You will see two offset-related columns, CURRENT-OFFSET and LOG-END-OFFSET, for each partition of the topic consumed by that group.
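A minimal sketch of that describe call, wrapped in a small helper. The function name describe_group and the variables KAFKA_GROUPS and BOOTSTRAP are illustrative assumptions (they let you switch between the Homebrew name kafka-consumer-groups and the tarball script bin/kafka-consumer-groups.sh); the group id is whatever your consumer actually uses.

```shell
# Hypothetical helper: show CURRENT-OFFSET and LOG-END-OFFSET for one consumer group.
# KAFKA_GROUPS and BOOTSTRAP are illustrative variables, not Kafka settings.
describe_group() {
  group="$1"   # consumer group id to inspect
  "${KAFKA_GROUPS:-kafka-consumer-groups}" \
    --bootstrap-server "${BOOTSTRAP:-localhost:9092}" \
    --describe \
    --group "$group"
}
```

For example, describe_group my-group, where my-group is a placeholder. A console consumer started without --group gets an auto-generated group id.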
Kafka maintains a numerical offset for each record in a partition. This offset acts as a unique identifier of a record within that partition, and also denotes the position of the consumer in the partition.
You can use the bin/kafka-topics.sh shell script with the --list option and the ZooKeeper service URL to display a list of all the topics in the Kafka cluster. You can also pass the Kafka cluster URL (--bootstrap-server) to list all topics.
If you look at the output you got after passing the offset value, it says that you need to specify a partition (at the top of the help section).
Topics are subdivided into partitions, and offsets are counted per partition. An offset of 1 could refer to a different record on each of potentially hundreds of partitions, so you must say which partition you mean.
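As a sketch of the simplest working form, the helper below passes --partition together with --offset. The function name consume_from and the variables KAFKA_CONSUMER and BOOTSTRAP are illustrative assumptions; partition 0 is assumed to exist, which holds for a default single-partition quickstart topic.

```shell
# Hypothetical helper: read one partition of a topic starting at a given offset.
# KAFKA_CONSUMER and BOOTSTRAP are illustrative variables, not Kafka settings.
consume_from() {
  topic="$1"
  partition="$2"
  offset="$3"   # a non-negative number, "earliest", or "latest"
  "${KAFKA_CONSUMER:-kafka-console-consumer}" \
    --bootstrap-server "${BOOTSTRAP:-localhost:9092}" \
    --topic "$topic" \
    --partition "$partition" \
    --offset "$offset"
}
```

Calling consume_from test 0 1 runs the equivalent of kafka-console-consumer --bootstrap-server localhost:9092 --topic test --partition 0 --offset 1. The earliest and latest keywords go in the same third argument.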
Regarding displaying the offsets, look up the GetOffsetShell command syntax.
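A rough sketch of one such invocation follows. It assumes an older Kafka release where the class is kafka.tools.GetOffsetShell and the broker flag is --broker-list; newer releases may use a different package name or flag, so check your version's help output. The function name latest_offsets and the variables KAFKA_RUN_CLASS and BOOTSTRAP are illustrative.

```shell
# Hypothetical helper: print the latest offset of every partition in a topic.
# Assumes the kafka.tools.GetOffsetShell class and the --broker-list flag of
# older Kafka releases; KAFKA_RUN_CLASS and BOOTSTRAP are illustrative variables.
latest_offsets() {
  topic="$1"
  "${KAFKA_RUN_CLASS:-kafka-run-class}" kafka.tools.GetOffsetShell \
    --broker-list "${BOOTSTRAP:-localhost:9092}" \
    --topic "$topic" \
    --time -1   # -1 asks for the latest offset, -2 for the earliest
}
```

Note that this reports offsets per partition, not the messages themselves.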
A nice command-line tool that can display the offset with each message is kafkacat.
kafkacat -b localhost:9092 -C -t test -f 'Topic %t [%p] at offset %o: key %k: %s\n'
It will print out something like
Topic test [5] at offset 111: key "0171bf8102007900e33": {"Message": "1"}
Topic test [2] at offset 123: key "070021b0f001f614c1b": {"Message": "2"}
Since GetOffsetShell only works with PLAINTEXT, it can be inconvenient for many.
The good news is that additional properties, including print.offset, seem to have made their way into 2.7, based on the communication in the 9099 PR. It should now be possible to use print.offset=true.
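A minimal sketch of that option, assuming Kafka 2.7 or later so that the console consumer's default formatter accepts print.offset and print.partition. The function name consume_with_offsets and the variables KAFKA_CONSUMER and BOOTSTRAP are illustrative.

```shell
# Hypothetical helper: print every message of a topic with its partition and offset.
# Assumes Kafka 2.7+; KAFKA_CONSUMER and BOOTSTRAP are illustrative variables.
consume_with_offsets() {
  topic="$1"
  "${KAFKA_CONSUMER:-kafka-console-consumer}" \
    --bootstrap-server "${BOOTSTRAP:-localhost:9092}" \
    --topic "$topic" \
    --from-beginning \
    --property print.offset=true \
    --property print.partition=true
}
```

For example, consume_with_offsets test. No --partition flag is needed here, because the consumer reads from the beginning of every partition rather than from a specific offset.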