
How to minimize the latency involved in the Kafka messaging framework?

Tags:

apache-kafka

Scenario: I have a low-volume topic (~150 msgs/sec) for which I would like a low propagation delay from producer to consumer.

I added a timestamp at the producer and read it at the consumer to record the propagation delay. With the default configuration, a 20-byte message showed a propagation delay of 1230 ms to 1960 ms. No network delay is involved, since I tested with one producer and one simple consumer on the same machine.
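A minimal sketch of this kind of measurement, in Python and without any Kafka client. The helper names (`now_ms`, `stamp`, `delay_ms`) and the 8-byte timestamp header are assumptions for illustration, not part of the original test. It relies on producer and consumer sharing one clock, which holds here because both run on the same machine.

```python
import struct
import time
from typing import Optional, Tuple

# Hypothetical wire format: 8-byte big-endian send time in ms, then the payload.
_HEADER = struct.Struct(">q")


def now_ms() -> int:
    """Current wall-clock time in milliseconds."""
    return int(time.time() * 1000)


def stamp(payload: bytes, sent_ms: Optional[int] = None) -> bytes:
    """Producer side: prefix the payload with the send time."""
    if sent_ms is None:
        sent_ms = now_ms()
    return _HEADER.pack(sent_ms) + payload


def delay_ms(message: bytes, received_ms: Optional[int] = None) -> Tuple[int, bytes]:
    """Consumer side: return (propagation delay in ms, original payload)."""
    if received_ms is None:
        received_ms = now_ms()
    (sent_ms,) = _HEADER.unpack_from(message)
    return received_ms - sent_ms, message[_HEADER.size:]
```

The producer would send `stamp(payload)` and the consumer would call `delay_ms(message)` on each message it receives. With a 12-byte payload the stamped message is 20 bytes, the size used in the test.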

When I adjusted the topic flush interval to 20 ms, the delay dropped to 980 ms to 1100 ms. I then set the consumer's "fetcher.backoff.ms" to 10 ms, and it dropped to 860 ms to 1070 ms.

Issue: For a 20-byte message, I would like the propagation delay to be as low as possible, and ~950 ms is too high.

Question: Am I missing anything in the configuration? Comments are welcome, as are the minimum delays you have achieved.

Assumption: Kafka involves disk I/O before the consumer gets the message from the producer, so the delay depends on hard disk speed (RPM) and so on.


Update: I tried tuning the log flush policy for durability and latency.
The configuration is as follows:
# The number of messages to accept before forcing a flush of data to disk
log.flush.interval=10
# The maximum amount of time a message can sit in a log before we force a flush
log.default.flush.interval.ms=100
# The interval (in ms) at which logs are checked to see if they need to be
# flushed to disk.
log.default.flush.scheduler.interval.ms=100

For the same 20-byte message, the delay was 740 ms to 880 ms.

The comments in the configuration file itself make the following clear:
There are a few important trade-offs:

  1. Durability: Unflushed data is at greater risk of loss in the event of a crash.
  2. Latency: Data is not made available to consumers until it is flushed (which adds latency).
  3. Throughput: The flush is generally the most expensive operation.

So I believe there is no way to get down to 150 ms to 250 ms without a hardware upgrade.

Amol M Kulkarni asked Dec 11 '13 13:12



1 Answer

I am not trying to dodge the question, but I think that Kafka is a poor choice for this use case. While I think Kafka is great (I have been a huge proponent of its use at my workplace), its strength is not low latency. Its strengths are high producer throughput and support for both fast and slow consumers. While it does provide durability and fault tolerance, so do more general-purpose systems like RabbitMQ. RabbitMQ also supports a variety of clients, including Node.js. Where RabbitMQ falls short compared to Kafka is when you are dealing with extremely high volumes (say 150K msg/s). At that point, Rabbit's approach to durability starts to fall apart and Kafka really stands out. In my experience, Rabbit's durability and fault tolerance are more than adequate at 20K msg/s.

Also, to achieve such high throughput, Kafka deals with messages in batches. While the batches are small and their size is configurable, you can't make them too small without incurring a lot of overhead. Unfortunately, message batching makes low latency very difficult. While you can tune various settings in Kafka, I wouldn't use Kafka for anything where latency needs to be consistently less than 1-2 seconds.
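To make the batching trade-off concrete, here is a toy model in Python. It is only a sketch under stated assumptions and not how Kafka is implemented: messages arrive evenly spaced, and a batch is flushed when it holds `batch_size` messages or when its oldest message has waited `max_wait_ms`, whichever comes first. The function name and parameters are hypothetical.

```python
from typing import List


def batching_waits(rate_per_s: float, batch_size: int,
                   max_wait_ms: float, n_messages: int) -> List[float]:
    """Return how long (in ms) each message sits in a batch before the flush."""
    gap_ms = 1000.0 / rate_per_s       # spacing between evenly spaced arrivals
    waits: List[float] = []
    batch: List[float] = []            # arrival times of messages in the open batch
    for i in range(n_messages):
        t = i * gap_ms
        # Time-based flush: the oldest message hit its deadline before this arrival.
        if batch and t - batch[0] >= max_wait_ms:
            flush_at = batch[0] + max_wait_ms
            waits.extend(flush_at - a for a in batch)
            batch = []
        batch.append(t)
        # Size-based flush: the batch is full as soon as this message arrives.
        if len(batch) >= batch_size:
            waits.extend(t - a for a in batch)
            batch = []
    # Any leftover messages are flushed when the oldest one reaches its deadline.
    if batch:
        flush_at = batch[0] + max_wait_ms
        waits.extend(flush_at - a for a in batch)
    return waits
```

In this model, at 150 msg/s with a batch size of 10, the first message of each batch waits for nine more arrivals, which is about 60 ms, before anything is flushed. Shrinking the batch reduces that wait but means more flushes per second, which is the overhead mentioned above.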

Also, Kafka 0.7.2 is not a good choice if you are launching a new application. All of the focus is on 0.8 now, so you will be on your own if you run into problems, and I definitely wouldn't expect any new features. For future stable releases, see the stable Kafka release page.

Again, I think Kafka is great for some very specific, though popular, use cases. At my workplace we use both Rabbit and Kafka. While that may seem gratuitous, they really are complementary.

Paul M answered Sep 23 '22 17:09