I am working on a data ingestion use case where data arrives on multiple topics and has to be pushed to multiple tables based on the topic name. I am trying to understand whether having one consumer for all the topics makes any performance difference compared with having one consumer per topic/partition.
The performance difference between these 2 scenarios depends on the number of brokers, the number of partitions and the expected throughput.
When the number of brokers, partitions and the throughput are all high, a single consumer for all partitions very likely won't be able to cope with the traffic.
For example, if you have 5 brokers with 5 partitions each and each partition receives 5MB/s of traffic:
if you have a single consumer: it will need a connection to each broker and will have to handle 5 x 5 x 5 MB/s = 125MB/s. Depending on your consumer config this might not be feasible. Even if you can handle 125MB/s, this does not scale well if you end up adding more partitions.
if you have multiple consumers: each consumer will grab a subset of the partitions. With 5 consumers, each will only have to handle 25MB/s, which is easily feasible on a standard VM.
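The arithmetic behind the two scenarios above can be sketched as a small helper. This is purely illustrative (the function name and the assumption that partitions are split evenly, with the busiest consumer taking the ceiling, are mine), but it shows why the single-consumer load grows with every partition while the per-consumer load stays flat when you scale consumers with partitions:

```python
import math

def per_consumer_throughput(brokers, partitions_per_broker, mb_per_partition, consumers):
    """Return (total cluster throughput, load on the busiest consumer) in MB/s.

    Assumes partitions are spread as evenly as possible across consumers,
    so the busiest consumer owns ceil(partitions / consumers) partitions.
    """
    partitions = brokers * partitions_per_broker
    total = partitions * mb_per_partition
    busiest = math.ceil(partitions / consumers) * mb_per_partition
    return total, busiest

# The example from the answer: 5 brokers x 5 partitions x 5 MB/s.
print(per_consumer_throughput(5, 5, 5, consumers=1))  # single consumer takes all 125 MB/s
print(per_consumer_throughput(5, 5, 5, consumers=5))  # each of 5 consumers takes 25 MB/s
```

The same function also shows the scaling problem: doubling the partitions doubles the single consumer's load, but with one consumer per 5 partitions each consumer's load is unchanged.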
Kafka's consumer group feature makes it very easy to add consumers on the fly. So you can start with only a single consumer and add more if/when the throughput increases.
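To make the "add consumers on the fly" behavior concrete, here is a rough simulation of what a round-robin-style assignor does during a rebalance. The function and the topic/consumer names are illustrative, not Kafka's actual implementation (in reality the group coordinator and the configured partition assignor handle this): when a new consumer with the same group id joins, the same set of partitions is simply redistributed over the larger group.

```python
def assign_round_robin(partitions, consumers):
    """Toy model of a round-robin assignor: partition i goes to consumer i % n."""
    assignment = {c: [] for c in consumers}
    for i, partition in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment

partitions = [f"ingest-{n}" for n in range(6)]

# Start with a single consumer: it owns every partition.
print(assign_round_robin(partitions, ["c1"]))

# Throughput grows, so two more consumers join the same group;
# a rebalance spreads the same 6 partitions across all 3.
print(assign_round_robin(partitions, ["c1", "c2", "c3"]))
```

In real code this is just multiple consumers subscribing with the same `group.id`; no partition bookkeeping is needed on your side.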