
Apache-Kafka, batch.size vs buffer.memory

I'm trying to figure out the difference between the settings batch.size and buffer.memory in the Kafka producer.

As I understand it, batch.size is the maximum size of a batch that can be sent.

The documentation describes buffer.memory as: the bytes of memory the Producer can use to buffer records waiting to be sent.

I don't understand the difference between these two. Can someone explain?

Thanks

calleman123 asked Apr 04 '18 10:04



2 Answers

In my opinion,

batch.size: The maximum size in bytes of a single batch of records going to one partition. If batch.size is (32*1024), each batch can hold up to 32 KB. A request to a broker can carry several batches, one per partition, so this is not a limit on the whole request (that is what max.request.size is for).

buffer.memory: The total memory the producer can use for records that have not been sent yet. If the Kafka producer cannot get messages (batches) out to the broker fast enough (say the broker is down), the batches pile up in this buffer (default 32 MB). Once the buffer is full, send() waits up to max.block.ms (default 60,000 ms) for space to free up, and then throws an exception.
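To make the relationship concrete, here is a rough toy model in Python. It is not the Kafka client's code: the class ToyAccumulator and its append method are names made up for this illustration, and the rule that each new batch reserves max(batch_size, record_size) bytes from the pool is a simplifying assumption. It only shows that batch.size caps each per-partition batch, while buffer.memory caps the sum of all batches.

```python
from collections import defaultdict

# Kafka's documented defaults, in bytes.
BATCH_SIZE = 16 * 1024            # batch.size
BUFFER_MEMORY = 32 * 1024 * 1024  # buffer.memory


class ToyAccumulator:
    """Toy model only: per-partition batches drawn from one shared memory budget."""

    def __init__(self, batch_size=BATCH_SIZE, buffer_memory=BUFFER_MEMORY):
        self.batch_size = batch_size
        self.free = buffer_memory         # bytes still available in the shared pool
        self.batches = defaultdict(list)  # partition -> fill level of each batch

    def append(self, partition, record_size):
        """Return True if the record was buffered, False if the pool is exhausted."""
        batches = self.batches[partition]
        # Reuse the newest batch if the record still fits under batch_size.
        if batches and batches[-1] + record_size <= self.batch_size:
            batches[-1] += record_size
            return True
        # Otherwise open a new batch; an oversized record gets a batch of its own.
        needed = max(self.batch_size, record_size)
        if needed > self.free:
            return False  # a real producer would block here for up to max.block.ms
        self.free -= needed
        batches.append(record_size)
        return True


# With the defaults, the pool has room for this many full-size batches:
print(BUFFER_MEMORY // BATCH_SIZE)  # 2048

acc = ToyAccumulator(batch_size=100, buffer_memory=300)
print(acc.append(0, 60))  # True: first batch for partition 0
print(acc.append(0, 60))  # True: 60 + 60 > 100, so a second batch is opened
print(acc.append(1, 10))  # True: partition 1 gets its own batch
print(acc.append(2, 10))  # False: all 300 bytes are reserved
```

The last call fails even though the record is tiny, because the limit that was hit is the shared pool (buffer.memory), not the per-batch limit (batch.size).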

Shiva Garg answered Oct 23 '22 13:10


Both of these producer configurations are described on the Confluent documentation page as follows:

  • batch.size

Kafka producers attempt to collect sent messages into batches to improve throughput. With the Java client, you can use batch.size to control the maximum size in bytes of each message batch.

  • buffer.memory

Use buffer.memory to limit the total memory that is available to the Java client for collecting unsent messages. When this limit is hit, the producer will block on additional sends for as long as max.block.ms before raising an exception.
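The "block, then raise" behaviour described for buffer.memory can be sketched with a small bounded pool. This is only an illustration in plain Python, assuming a simple byte counter guarded by a condition variable; BoundedBuffer, reserve and release are hypothetical names, not part of any Kafka client, and Python's built-in TimeoutError stands in for whatever exception the real producer raises.

```python
import threading


class BoundedBuffer:
    """Toy stand-in for the producer's buffer: reserve() blocks while memory is short."""

    def __init__(self, buffer_memory, max_block_ms):
        self.free = buffer_memory                 # plays the role of buffer.memory
        self.max_block_s = max_block_ms / 1000.0  # plays the role of max.block.ms
        self.cond = threading.Condition()

    def reserve(self, size):
        """Reserve `size` bytes, waiting up to max_block_ms; raise TimeoutError on expiry."""
        with self.cond:
            # wait_for returns False if the predicate is still false at the timeout.
            ok = self.cond.wait_for(lambda: self.free >= size,
                                    timeout=self.max_block_s)
            if not ok:
                raise TimeoutError(
                    f"could not reserve {size} bytes within {self.max_block_s}s")
            self.free -= size

    def release(self, size):
        """Return bytes to the pool, as would happen once a batch has been sent."""
        with self.cond:
            self.free += size
            self.cond.notify_all()
```

If another thread calls release() before the timeout expires, a blocked reserve() wakes up and succeeds; if nothing frees space in time, the caller gets the exception. That is the trade-off max.block.ms controls: a larger value tolerates longer broker outages but keeps the sending thread blocked for longer.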

Kyrylo Bulat answered Oct 23 '22 12:10