
How to evenly distribute messages over partitions in Kafka?

Tags:

apache-kafka

Setting the stage..

Here's a diagram to help explain my problem better:

[Image: diagram of the setup described below]

Now, keep in mind the following points:

  • I have a producer sending messages to the 8 partitions of my topic.
  • On the other side, I have 8 consumers, one for each partition.
  • The legacy system has limited resources, and can process at most 8 simultaneous requests.

To make sure I don't overwhelm the legacy system, a consumer will only send one request at a time. Any new message will wait for the current message to finish processing.

Explaining the problem..

Since messages are blocked until the previous message is processed, I want to minimize the time a message waits before it's processed. To do that, I need messages to be distributed evenly over the partitions. A message must not be consumed by a busy consumer when another consumer is free.

For example, if 8 messages are produced simultaneously, each message should be sent to one partition. Therefore, each message will be consumed by one consumer, ensuring the messages are processed concurrently without any lag.

What I tried so far

Since the partitions are assigned correctly to the consumers, I had to assume the producer wasn't delivering messages evenly to the partitions, which turned out to be the case. Here's what I tried so far to resolve the issue.

Using null keys

The most intuitive solution was to produce records without keys, which should basically make the DefaultPartitioner behave like the RoundRobinPartitioner. Unfortunately, this solution did not work.

Using null keys and batch.size=0

Since using null keys didn't work, I suspected that messages were being sent in batches, breaking the even distribution. Setting the batch size to 0 should've caused the producer to send messages one by one. That didn't work either.

Using RoundRobinPartitioner

This one was weird. The RoundRobinPartitioner distributed messages evenly, but it only used 4 out of the 8 partitions.

Using RoundRobinPartitioner and batch.size=0

This made no difference.

Finally, my question:

I need the producer to send messages in Round Robin fashion one by one without batching. How can I do that?

TL;DR

I need the producer to send messages in Round Robin fashion without batching. How can I do that?

Odai Mohammed asked Dec 09 '25 18:12



1 Answer

I've been meaning to post the answer for a while now. So here it is.

I discovered a bug in how the default round-robin partitioner calculated the next partition number, so I implemented a custom partitioner. I called it a Retaining Round Robin partitioner because it remembers the partition to which the last message was written and sends the next message to the partition after it.
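The answer does not spell out what the bug was. One mechanism that would reproduce the "only 4 out of 8 partitions" symptom from the question is a round-robin counter that advances more than once per record. The simulation below is a sketch under that assumption only. `SkippingCounterDemo`, `nextPartition`, and `partitionsUsed` are hypothetical names, and none of this is Kafka's actual code.

```java
import java.util.Set;
import java.util.TreeSet;

public class SkippingCounterDemo {
    // Hypothetical stand-in for a round-robin counter (not Kafka's code).
    static int counter = 0;

    // Returns the current counter modulo the partition count, then advances it.
    static int nextPartition(int numPartitions) {
        return counter++ % numPartitions;
    }

    // Simulates sending `records` records when the partition is computed
    // `callsPerRecord` times per record and only the last result is used.
    static Set<Integer> partitionsUsed(int numPartitions, int records, int callsPerRecord) {
        counter = 0;
        Set<Integer> used = new TreeSet<>();
        for (int i = 0; i < records; i++) {
            int chosen = -1;
            for (int c = 0; c < callsPerRecord; c++) {
                chosen = nextPartition(numPartitions);
            }
            used.add(chosen);
        }
        return used;
    }

    public static void main(String[] args) {
        // One computation per record: every partition is used.
        System.out.println(partitionsUsed(8, 100, 1)); // [0, 1, 2, 3, 4, 5, 6, 7]
        // Two computations per record: every other partition is skipped.
        System.out.println(partitionsUsed(8, 100, 2)); // [1, 3, 5, 7]
    }
}
```

With an even partition count, advancing the counter twice per record means only half of the partitions ever receive data, which would match the behaviour observed with the RoundRobinPartitioner.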

You can find the implementation on my GitHub repo under RetainingRoundRobinPartitioner.java
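The linked file is not reproduced here. As a rough illustration of the idea (remember the last partition, then hand out the one after it), a minimal sketch of the selection logic might look like the following. The class name `RetainingRoundRobinSketch` and the method `nextPartition` are hypothetical and chosen for this example. The sketch leaves out the wiring into Kafka's `Partitioner` interface so that it runs without the Kafka client library on the classpath.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

public class RetainingRoundRobinSketch {
    // Last partition handed out per topic; -1 means nothing has been sent yet.
    private final ConcurrentMap<String, AtomicInteger> lastPartition = new ConcurrentHashMap<>();

    // Returns the partition that follows the last one used for this topic,
    // wrapping around to 0 after the highest partition number.
    public int nextPartition(String topic, int numPartitions) {
        AtomicInteger last = lastPartition.computeIfAbsent(topic, t -> new AtomicInteger(-1));
        return last.updateAndGet(p -> (p + 1) % numPartitions);
    }

    public static void main(String[] args) {
        RetainingRoundRobinSketch rr = new RetainingRoundRobinSketch();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10; i++) {
            sb.append(rr.nextPartition("my-topic", 8)).append(' ');
        }
        System.out.println(sb.toString().trim()); // 0 1 2 3 4 5 6 7 0 1
    }
}
```

In a real producer, logic like this would sit inside a class implementing `org.apache.kafka.clients.producer.Partitioner` and be registered through the `partitioner.class` producer setting. If the producer computes the partition more than once per record, the implementation has to account for that; this sketch does not.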

Odai Mohammed answered Dec 14 '25 17:12



