
Understanding the max.inflight property of kafka producer

I'm benchmarking my Kafka cluster, version 1.0.0-cp1.

In the part of my benchmark that focuses on the maximum possible throughput with an ordering guarantee and no data loss (a topic with only one partition), do I need to set the max.in.flight.requests.per.connection property to 1?

I've read this article

My understanding is that I only have to set max.in.flight to 1 if I enable retries on my producer via the retries property.

To put my question another way: is one partition + retries=0 (producer props) sufficient to guarantee ordering in Kafka?

I need to know because increasing max.in.flight drastically increases throughput.

asked Apr 12 '18 by Quentin Geff

3 Answers

Yes, you must set the max.in.flight.requests.per.connection property to 1. The article you read originally contained a mistake (since corrected) where the author wrote:

max.in.flights.requests.per.session

which doesn't exist in the Kafka documentation.

This erratum probably comes from the book "Kafka: The Definitive Guide" (1st edition), where page 52 reads:

"...so if guaranteeing order is critical, we recommend setting in.flight.requests.per.session=1 to make sure that while a batch of messages is retrying, additional messages will not be sent..."

answered Nov 11 '22 by David Zamora


Your use case is slightly unclear. You mention ordering and no data loss but don't specify whether you can tolerate duplicate messages, so it's unclear if you want At Least Once (QoS 1) or Exactly Once.

Either way, as you're using 1.0.0 and only a single partition, you should have a look at the Idempotent Producer instead of tweaking the Producer configs. It allows you to properly and efficiently guarantee ordering and no data loss.

From the documentation:

Idempotent delivery ensures that messages are delivered exactly once to a particular topic partition during the lifetime of a single producer.

Early versions of the Idempotent Producer forced max.in.flight.requests.per.connection to 1 (for the same reasons you mentioned), but in the latest releases it can be used with max.in.flight.requests.per.connection set to up to 5 while still keeping its guarantees.

Using the Idempotent Producer, you'll not only get stronger delivery semantics (Exactly Once instead of At Least Once), but it might even perform better!
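As a sketch of that setup (the keys are real Kafka producer config names; the values just illustrate the point, this is not a full client):

```python
# Illustrative producer settings: idempotence enabled, with pipelining
# preserved. Ordering is still guaranteed with up to 5 in-flight requests.
idempotent_config = {
    "enable.idempotence": True,
    "acks": "all",  # required when idempotence is enabled
    "max.in.flight.requests.per.connection": 5,  # 5 is the max that keeps the guarantee
}
print(idempotent_config["max.in.flight.requests.per.connection"])  # 5
```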

I recommend you check the delivery semantics in the docs: http://kafka.apache.org/documentation/#semantics


Back to your question

Yes: without the idempotent (or transactional) producer, if you want to avoid data loss (QoS 1) and preserve ordering, you have to set max.in.flight.requests.per.connection to 1, allow retries, and use acks=all. As you saw, this comes at a significant performance cost.
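Sketched the same way (real config keys, illustrative values; retries just needs to be greater than 0 for the guarantee to hold):

```python
# Illustrative settings for ordering + no data loss WITHOUT idempotence:
# retries enabled, acks=all, and only one request in flight at a time.
ordered_config = {
    "acks": "all",
    "retries": 2147483647,  # Integer.MAX_VALUE; any value > 0 enables retries
    "max.in.flight.requests.per.connection": 1,  # the throughput-limiting setting
}
print(ordered_config["max.in.flight.requests.per.connection"])  # 1
```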

answered Nov 11 '22 by Mickael Maison


IMO, it's also invaluable to know about this issue, which makes things far more interesting and slightly more complicated.

When you enable enable.idempotence=true, every time you send a batch to the broker, you also send a sequence number, starting from zero. The broker stores that sequence number on its side too. Suppose the broker's currently stored sequence number is 3; when the next request arrives, the broker can look at its incoming sequence number and say:

  • if it's 4 - good, it's a new batch of records
  • if it's 3 (or lower) - it's a duplicate
  • if it's 5 (or higher) - it means messages were lost
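A toy sketch of that sequence check (plain Python, not actual broker code), assuming the broker last stored sequence number 3:

```python
def classify_batch(stored_seq: int, incoming_seq: int) -> str:
    """Toy version of the broker-side sequence-number check."""
    if incoming_seq == stored_seq + 1:
        return "new"             # the expected next batch: accept and store it
    if incoming_seq <= stored_seq:
        return "duplicate"       # already seen: drop it
    return "out_of_sequence"     # a gap: some earlier batches were lost

print(classify_batch(3, 4))  # new
print(classify_batch(3, 3))  # duplicate
print(classify_batch(3, 5))  # out_of_sequence
```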

And now max.in.flight.requests.per.connection: a producer can have up to this many concurrent requests in flight without waiting for an answer from the broker. When we reach 3 (say max.in.flight.requests.per.connection=3), we start asking the broker for the previous results (and meanwhile we can't send any more batches, even if they are ready).

Now, for the sake of the example, let's say the broker says: "1 was OK, I stored it", "2 has failed", and now the important part: because 2 failed, the only possible answer for 3 is "out of order", which means it was not stored. The client now knows it needs to reprocess 2 and 3, so it builds a list and resends them in that exact order (if retries are enabled).
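The same scenario as a toy sketch (illustrative names, not the real producer internals): batch 2 failed, so batch 3 is rejected as out of order, and both go back on the retry list in their original order:

```python
# Results reported by the broker for 3 in-flight batches (toy example).
results = {1: "ok", 2: "failed", 3: "out_of_order"}

# Everything that was not stored is resent, preserving the original order.
retry_list = [batch for batch in sorted(results) if results[batch] != "ok"]
print(retry_list)  # [2, 3]
```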

This explanation is probably oversimplified, but it's my basic understanding after reading the source code a bit.

answered Nov 11 '22 by Eugene