Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Purpose and difference b/w Apache camel Kafka consumer URI option consumerStreams vs consumersCount

Apache Camel Kafka Consumer provides URI options called "consumerStreams" and "consumersCount".

Need to understand the difference and usage scenarios and how it will fit with multi-partition Kafka topic message consumption

like image 344
user3108392 Avatar asked Mar 10 '16 15:03

user3108392


2 Answers

consumerCount controls how many consumer instances are created by the camel endpoint. So if you have 3 partitions in your topic and you have a consumerCount of 3 then you can consume 3 messages (1 per partition) at a time. This setting does exactly what you would expect from the documentation

consumerStreams is a totally different setting and has imho a misleading name AND a misleading documentation.

Currently the documentation (https://github.com/apache/camel/blob/master/components/camel-kafka/src/main/docs/kafka-component.adoc) says:

consumerStreams: Number of concurrent consumers on the consumer

But the source code reveals its real purpose:

consumerStreams configures how many Threads are available for all consumers to be run on. Internally the Kafka endpoint creates one Runnable per consumer. (consumerCount = 3) means 3 Runnables. These runnables are executed by an ThreadPoolExecutorService which is scaled by the consumerStreams setting.

Since the single consumer threads are long running tasks the only purpose of consumerStreams can be to handle reconnection or blocked threads. A higher value for consumerStreams does not lead to more parallelization. And it should better be named consumerThreadPoolSize or something like that.

like image 144
Chris Avatar answered Oct 03 '22 07:10

Chris


I have checked Camel Kafka source code, it seems there is a different use of these parameters overtime.

consumerStreams were used in old versions of the Camel-Kafka component such as 2.13 as you can see here

consumersCount is used in latest versions of the Camel-Kafka component (see here) and it represents the number of org.apache.kafka.clients.consumer.KafkaConsumer that will be instantiated, so you should really use this for multi-partition consumption

it seems they were used together in camel 2.16

like image 40
vortex.alex Avatar answered Oct 03 '22 07:10

vortex.alex