
kafka consumer polling timeout

I am working with Kafka and trying to consume data from it. With the loop below, I can poll data from Kafka.

  while (true) {
    // poll(long) takes the timeout in milliseconds
    ConsumerRecords<byte[], byte[]> records = consumer.poll(Long.MAX_VALUE);
    for (ConsumerRecord<byte[], byte[]> record : records) {
        // retrieve data
    }
  }

My question is: what benefit do I get from passing Long.MAX_VALUE as the timeout, compared to passing 200? What is the best practice for a system that will run in production?

Can anyone explain the difference between a high timeout and a low timeout, and which one should be used in a production system?

john asked Dec 08 '16 02:12



1 Answer

Passing Long.MAX_VALUE makes consumption effectively synchronous: poll() blocks until it has records to return, however long that takes. Passing a lower value lets poll() return after at most that many milliseconds, possibly with no records, so your loop gets a chance to do something other than wait. Which one you should use depends on your actual scenario.
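To make the trade-off concrete, here is a minimal sketch of the loop shape a short timeout allows. It deliberately does not use the Kafka client: poll below is a hypothetical stand-in built on a BlockingQueue, and consumeUntilIdle, running and maxIdlePolls are names made up for this illustration. With the real consumer, the body of the loop would call consumer.poll(...) instead.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class PollTimeoutSketch {

    // Hypothetical stand-in for consumer.poll(timeout): waits up to `timeout`
    // for the first record, then also returns whatever else is already queued.
    static List<String> poll(BlockingQueue<String> source, Duration timeout) {
        List<String> batch = new ArrayList<>();
        try {
            String first = source.poll(timeout.toMillis(), TimeUnit.MILLISECONDS);
            if (first != null) {
                batch.add(first);
                source.drainTo(batch);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt visible to the caller
        }
        return batch;
    }

    // With a bounded timeout, control comes back to the loop even when no data
    // arrives, so it can check a shutdown flag or count idle polls.
    static List<String> consumeUntilIdle(BlockingQueue<String> source, AtomicBoolean running,
                                         Duration timeout, int maxIdlePolls) {
        List<String> seen = new ArrayList<>();
        int idlePolls = 0;
        while (running.get() && idlePolls < maxIdlePolls) {
            List<String> records = poll(source, timeout);
            if (records.isEmpty()) {
                idlePolls++;          // nothing arrived within the timeout
            } else {
                idlePolls = 0;
                seen.addAll(records); // "retrieve data" would go here
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        BlockingQueue<String> source = new LinkedBlockingQueue<>();
        source.add("a");
        source.add("b");
        List<String> seen = consumeUntilIdle(source, new AtomicBoolean(true), Duration.ofMillis(200), 1);
        System.out.println(seen); // prints [a, b]
    }
}
```

With an effectively infinite timeout, the same loop could only re-check running after a record arrives. The real KafkaConsumer offers wakeup(), which can be called from another thread to break out of a blocked poll() with a WakeupException, so a long timeout is still workable if shutdown is handled that way. Note also that since Kafka 2.0, poll(long) is deprecated in favor of poll(Duration).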

amethystic answered Oct 09 '22 22:10