
Implications of keeping linger.ms at 0

We are using Kafka 0.10.2.1. The documentation specifies that a buffer is available to send even if it isn't full:

By default a buffer is available to send immediately even if there is additional unused space in the buffer. However if you want to reduce the number of requests you can set linger.ms to something greater than 0.

However, it also says that the producer will attempt to batch requests even if the linger time is set to 0 ms:

Note that records that arrive close together in time will generally batch together even with linger.ms=0 so under heavy load batching will occur regardless of the linger configuration; however setting this to something larger than 0 can lead to fewer, more efficient requests when not under maximal load at the cost of a small amount of latency.

Intuitively, it seems that any kind of batching would require some linger time, and that the only way to achieve a linger time of 0 would be to make the broker call synchronous. Clearly, keeping the linger time at 0 doesn't harm performance as much as blocking on the send call would, but it still seems to have some impact on performance. Can someone clarify what the docs are saying above?

Aditya Vivek asked Mar 16 '18 07:03




1 Answer

The docs are saying that even with linger.ms set to 0 you can still end up with some batching under load, because records are added to the buffer faster than the sender thread can dispatch them. Leaving linger.ms at 0 optimizes for minimal latency. If the measure of performance you really care about is throughput, you would increase the linger time a bit so that more records go into each request, and that is what the docs are getting at. It has little to do with synchronous sends in this case. More in depth info
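To make the point concrete, here is a rough toy model in Python. It is not Kafka's actual implementation: it assumes a single sender that handles one request at a time, a fixed request duration (`request_ms`), and a batch limit counted in records (`max_batch`) standing in for the byte-based batch.size. The function name `simulate` and all its parameters are made up for this sketch. It only illustrates why records pile up into batches whenever they arrive faster than requests complete, even with a linger time of 0.

```python
def simulate(arrivals, linger_ms, request_ms, max_batch):
    """Toy model: return the size of each batch that gets sent.

    arrivals   -- sorted arrival times of records, in ms
    linger_ms  -- how long a batch waits after its first record
    request_ms -- assumed fixed duration of one send request
    max_batch  -- max records per batch (stand-in for batch.size)
    """
    batches = []
    n = len(arrivals)
    i = 0                 # index of the first record not yet sent
    sender_free_at = 0    # time at which the sender can start a new request

    while i < n:
        first = arrivals[i]
        # The batch is ready once linger expires or it fills up.
        last = i + max_batch - 1
        full_at = arrivals[last] if last < n else float("inf")
        ready_at = min(first + linger_ms, full_at)
        # It cannot go out before the previous request has finished.
        send_at = max(ready_at, sender_free_at)

        # Everything that arrived by send_at rides along, up to max_batch.
        j = i
        while j < n and j - i < max_batch and arrivals[j] <= send_at:
            j += 1

        batches.append(j - i)
        sender_free_at = send_at + request_ms
        i = j

    return batches


light = list(range(0, 100, 10))   # one record every 10 ms
heavy = list(range(20))           # one record every 1 ms

print(simulate(light, linger_ms=0, request_ms=2, max_batch=100))
# [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]  -> no batching, lowest latency
print(simulate(light, linger_ms=25, request_ms=2, max_batch=100))
# [3, 3, 3, 1]                    -> fewer requests, up to 25 ms extra delay
print(simulate(heavy, linger_ms=0, request_ms=5, max_batch=100))
# [1, 5, 5, 5, 4]                 -> batching under load despite linger 0
```

In the last case nothing ever waits on purpose. Records simply accumulate while the previous request is still in progress, which is the "batching under heavy load" the docs describe.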

dawsaw answered Oct 12 '22 10:10
