
Kafka message codec - compress and decompress

When using Kafka, I can set a compression codec by setting the kafka.compression.codec property of my Kafka producer.
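For context, the producer-side setting looks roughly like this. This is a minimal sketch: the broker address and serializer class are assumptions, and only the configuration itself is shown, since constructing a real producer needs the kafka jar and a running broker.

```java
import java.util.Properties;

public class ProducerCompressionConfig {
    public static void main(String[] args) {
        // Compression is chosen on the producer side only.
        // ("compression.codec" is the 0.8-era key; newer clients use "compression.type".)
        Properties props = new Properties();
        props.put("metadata.broker.list", "localhost:9092");       // assumed broker address
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("compression.codec", "snappy");                  // none, gzip, or snappy

        // A producer would be built from these props; this sketch only
        // prints the codec to show where the setting lives.
        System.out.println(props.getProperty("compression.codec"));
    }
}
```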

Suppose I use snappy compression in my producer. When consuming the messages from Kafka with some consumer, do I need to do anything to decode the data from snappy, or is decompression a built-in feature of the Kafka consumer?

In the relevant documentation I could not find any consumer property related to compression (the codec setting exists only on the producer).

Can someone clarify this?

forhas asked Nov 10 '13 14:11

1 Answer

As far as I understand, decompression is taken care of by the consumer itself. As mentioned on the official wiki page, "The consumer iterator transparently decompresses compressed data and only returns an uncompressed message."

As described in this article, the consumer works as follows:

The consumer has background "fetcher" threads that continuously fetch data in batches of 1MB from the brokers and add it to an internal blocking queue. The consumer thread dequeues data from this blocking queue, decompresses it, and iterates through the messages.

The doc page also states, under End-to-end Batch Compression, that:

A batch of messages can be clumped together, compressed, and sent to the server in this form. This batch of messages will be written in compressed form, will remain compressed in the log, and will only be decompressed by the consumer.

So it appears that decompression is handled by the consumer itself; all you need to do is provide a valid/supported compression type via the compression.codec attribute of ProducerConfig when creating the producer. I couldn't find any example or explanation describing a separate decompression step on the consumer end. Please correct me if I am wrong.
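To illustrate the point, here is a minimal sketch of a consumer configuration in the same 0.8-era style (the ZooKeeper address and group id are made-up values): note that no compression-related key appears anywhere, because the consumer decompresses transparently.

```java
import java.util.Properties;

public class ConsumerNoCodecConfig {
    public static void main(String[] args) {
        // Consumer-side configuration: there is no compression property to set.
        // The consumer iterator hands back already-decompressed messages.
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // assumed address
        props.put("group.id", "example-group");           // hypothetical group name
        props.put("auto.offset.reset", "smallest");

        // Confirm that no codec key was (or needs to be) configured.
        System.out.println(props.containsKey("compression.codec"));
    }
}
```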

user2720864 answered Oct 04 '22 05:10
