How can I verify if compression is working correctly in Kafka 0.8.2.2?

Tags:

apache-kafka

I am using Kafka 0.8.2.2 and am trying to set up compression. I pass the compression codec (gzip) as an argument to the console producer, as shown below:

./kafka-console-producer.sh --broker-list localhost:171 --compression-codec gzip --topic testTopic

Questions:

1. Is this the only place where I need to specify compression?
2. How do I verify that compression is indeed taking place?
3. How do I quantify the benefit I am getting from compression?
4. Which files (.index, .log) should I look at, and compare in size with and without compression, to estimate the benefit?

Girish Chafle asked Apr 13 '16 06:04



1 Answer

How to verify if compression is happening?

Use the DumpLogSegments tool, substituting your own log directory and log file name (the default log.dir is /tmp/kafka-logs):

bin/kafka-run-class.sh kafka.tools.DumpLogSegments --files /your_kafka_logs_dir/your_topic-your_partition/00000000000000000000.log --print-data-log | grep compresscodec

You will see output like the following (the exact fields vary with the Kafka version):

baseOffset: 0 lastOffset: 0 count: 1 ... compresscodec: NONE ...
baseOffset: 1 lastOffset: 1 count: 1 ... compresscodec: GZIP ...
baseOffset: 2 lastOffset: 2 count: 1 ... compresscodec: SNAPPY ...
baseOffset: 3 lastOffset: 3 count: 1 ... compresscodec: LZ4 ...
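
If the segment holds many batches, scanning the grep output by eye gets tedious. A minimal sketch of a tally, assuming the dump output contains `compresscodec: <NAME>` fields like those above (`count_codecs` is a hypothetical helper name, not part of Kafka):

```shell
# Hypothetical helper: tally the compresscodec values found in
# DumpLogSegments output read from stdin.
count_codecs() {
  # -o prints only the matching "compresscodec: NAME" fragment of each line
  grep -o 'compresscodec: [A-Za-z0-9]*' | sort | uniq -c
}

# Example (paths are placeholders, as in the command above):
# bin/kafka-run-class.sh kafka.tools.DumpLogSegments \
#   --files /your_kafka_logs_dir/your_topic-your_partition/00000000000000000000.log \
#   --print-data-log | count_codecs
```

If every count falls under NONE, the messages were probably written uncompressed.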

More information can be found in the documentation: https://kafka.apache.org/documentation/#design_compression
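
On quantifying the benefit: one rough approach is to send the same messages to two topics, one with and one without compression, and compare the total size of the .log segment files in each partition directory. The .log files hold the message data, so they are the ones to compare. A minimal sketch (`segment_bytes` is a hypothetical helper, and the topic directory names in the comment are placeholders):

```shell
# Hypothetical helper: sum the sizes in bytes of all .log segment files
# in one partition directory.
segment_bytes() {
  dir=$1
  total=0
  for f in "$dir"/*.log; do
    [ -f "$f" ] || continue      # skip the unexpanded glob if no .log files exist
    size=$(wc -c < "$f")
    total=$((total + size))
  done
  echo "$total"
}

# Example (directory names are assumptions for illustration):
# plain=$(segment_bytes /tmp/kafka-logs/testTopicPlain-0)
# gzip=$(segment_bytes /tmp/kafka-logs/testTopic-0)
# echo "uncompressed: $plain bytes, gzip: $gzip bytes"
```

The comparison is only meaningful when both topics received the same messages, and the ratio depends on how many messages the producer batches together.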

Marina answered Oct 23 '22 05:10