
Kafka Consumer needs a long poll duration

Using Kafka/Java with the following configuration:

props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, this.bootstrapServers);
props.put(ConsumerConfig.GROUP_ID_CONFIG, this.groupId);
props.put(ConsumerConfig.CLIENT_ID_CONFIG, UUID.randomUUID().toString());
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, this.maxPollRecords);
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, keyDeserializerClass.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, valueDeserializerClass.getName());
props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, IsolationLevel.READ_COMMITTED.toString().toLowerCase(Locale.ROOT));

I have a simple polling loop like:

consumer.poll(Duration.ofMillis(200));

I noticed some strange behavior. With a duration of 0, it returns no results. Locally, with a 200 ms duration, I get some results, but in a production environment it never returns results unless the duration is at least 1 s.

In my understanding, the poll method waits until it finds at least one result. With a duration of 0, it should at least return the results that have already arrived; it should not always return nothing.

What is the explanation?

Rolintocour asked Mar 08 '19 16:03



1 Answer

According to the docs:

public ConsumerRecords<K,V> poll(long timeout)
timeout - The time, in milliseconds, spent waiting in poll if data is not available in the buffer. If 0, returns immediately with any records that are available currently in the buffer, else returns empty. Must not be negative.

So, basically, a poll call blocks the thread it is called from, and the poll duration is the maximum time it may block that thread. If the timeout is zero, or shorter than the time it takes to send a fetch request and receive the response from the broker, and nothing is in the buffer yet, no records are returned.
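
To make the timing argument concrete, here is a minimal sketch that uses only the JDK, not the Kafka client. It is a toy model: the class name PollTimeoutDemo, the 300 ms latency, and the record name are all made up for illustration. A background thread plays the role of the fetch and fills the buffer after the simulated latency, so a poll with a shorter timeout finds the buffer empty.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Toy model only: this is NOT the Kafka client, just a buffer filled by a slow "fetch".
class PollTimeoutDemo {

    // Hypothetical stand-in for one request/response round trip, in milliseconds.
    static final long SIMULATED_FETCH_LATENCY_MS = 300;

    // Starts a background "fetch" that puts one record into the buffer after the
    // simulated latency, and returns the (initially empty) buffer right away.
    static BlockingQueue<String> startFetch() {
        BlockingQueue<String> buffer = new LinkedBlockingQueue<>();
        Thread fetcher = new Thread(() -> {
            try {
                Thread.sleep(SIMULATED_FETCH_LATENCY_MS);
                buffer.put("record-1");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        fetcher.setDaemon(true);
        fetcher.start();
        return buffer;
    }

    // Waits at most timeoutMs for a record; returns null if none arrived in time.
    static String poll(BlockingQueue<String> buffer, long timeoutMs) throws InterruptedException {
        return buffer.poll(timeoutMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> buffer = startFetch();
        // The fetch is still in flight, so a zero timeout sees an empty buffer.
        System.out.println("timeout 0 ms    -> " + poll(buffer, 0));
        // A timeout longer than the latency blocks until the record lands.
        System.out.println("timeout 2000 ms -> " + poll(buffer, 2000));
    }
}
```

In this model the second call returns as soon as the record arrives, well before the 2000 ms are up, which is why a generous timeout costs little when data is flowing.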

Just for information: if you make this timeout large and set the consumer's max.poll.records property to the batch size you want, say max.poll.records: "10", poll does not wait out the full timeout. It returns as soon as records are available, handing back at most 10 per call, even if the timeout is large. Ideally you would know the network latency and size the timeout from it; otherwise the trick above works fine.
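
The large-timeout trick can be sketched the same way, again with JDK classes only and not the Kafka client. The class MaxPollRecordsDemo and its poll helper are hypothetical names, and the model assumes that poll hands back whatever is already buffered, capped at the maximum, instead of waiting for the cap to fill.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Toy model only: this is NOT the Kafka client.
class MaxPollRecordsDemo {

    // Waits up to timeoutMs for the first record, then returns whatever is already
    // buffered, capped at maxRecords. It never waits for the cap to be reached.
    static List<String> poll(BlockingQueue<String> buffer, int maxRecords, long timeoutMs)
            throws InterruptedException {
        List<String> batch = new ArrayList<>();
        String first = buffer.poll(timeoutMs, TimeUnit.MILLISECONDS);
        if (first == null) {
            return batch; // timed out with nothing buffered
        }
        batch.add(first);
        buffer.drainTo(batch, maxRecords - 1); // take the rest without blocking
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> buffer = new LinkedBlockingQueue<>();
        for (int i = 0; i < 25; i++) {
            buffer.add("record-" + i);
        }
        // 25 buffered records and a cap of 10: batches of 10, 10 and 5 come back
        // immediately; the final call waits out its timeout and returns empty.
        List<String> batch;
        do {
            batch = poll(buffer, 10, 200);
            System.out.println("batch size: " + batch.size());
        } while (!batch.isEmpty());
    }
}
```

Only the last call pays the full timeout, so in this model the timeout acts as an upper bound on idle waiting and not as a delay added to every call.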

aditya Raj answered Oct 09 '22 07:10
