Limit on the number of topics in Kafka

Tags:

apache-kafka

I have a particular use case where I might need a very large number of topics in Kafka. Essentially this is for time series data, and hence I would like to get a general understanding of how I should approach this.

I know that theoretically there is no limit, but practically there will be some limits. I would like to get some expert opinion here.

Is it possible to scale to a million topics, for instance, or even higher?

asked Oct 16 '18 by AMM

1 Answer

Well, there is no fixed limit defined for the number of topics/partitions in a cluster, but there are definitely some best practices that describe how to scale a cluster efficiently.

Actually, the number of topics by itself does not really determine the scalability of a cluster; the number of partitions matters more. Each topic can have one or more partitions. The more partitions you have, the more file handles will be open, and that will affect latency. More partitions also increase the potential unavailability window when a broker fails.
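For context, both the partition count and the replication factor are chosen per topic when it is created, so they are what drive the per-broker totals discussed here. Below is a minimal sketch using Kafka's Java AdminClient; the bootstrap address and topic name are placeholders, not anything from the question.

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // "localhost:9092" is a placeholder; point this at your own cluster.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Each topic carries its own partition count and replication factor;
            // here: 6 partitions, replication factor 3.
            NewTopic topic = new NewTopic("timeseries-example", 6, (short) 3);
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}
```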

So when you do cluster sizing and capacity planning, follow the rule below for a stable cluster.

As a rule of thumb, if you care about latency, it’s probably a good idea to limit the number of partitions per broker to 100 x b x r, where b is the number of brokers in a Kafka cluster and r is the replication factor.
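As a rough, back-of-the-envelope illustration of that rule (my own sketch, not from the Confluent post), here is what it works out to for a 6-broker cluster with replication factor 3:

```java
public class PartitionBudget {
    public static void main(String[] args) {
        int brokers = 6;       // b: number of brokers in the cluster
        int replication = 3;   // r: replication factor

        // Rule of thumb quoted above: cap partitions per broker at 100 * b * r.
        int perBrokerCap = 100 * brokers * replication;   // 1800

        // One way to read that cluster-wide: replicas are spread across all brokers
        // and each partition has r replicas, so the total partition count comes out
        // to roughly 100 * b * b (an interpretation, not a hard limit).
        int clusterPartitionCap = perBrokerCap * brokers / replication;  // 3600

        System.out.printf("Per-broker partition cap: %d%n", perBrokerCap);
        System.out.printf("Approx. cluster-wide partition cap: %d%n", clusterPartitionCap);
    }
}
```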

Here is a nice blog post by Confluent: https://www.confluent.io/blog/how-choose-number-topics-partitions-kafka-cluster

Personally, I have experienced problems with 5,600 topics / 23,000 partitions on a 6-broker cluster. The brokers became unavailable due to the huge number of open file handles, and we had to scale the cluster to 12 nodes.
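If you want to check how close your own cluster is to numbers like that, here is a rough sketch (again with a placeholder bootstrap address) that counts topics and partitions with the Java AdminClient:

```java
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;

public class ClusterPartitionCount {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder address; replace with your cluster's bootstrap servers.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // List every topic, then sum the partition counts across all of them.
            Set<String> topics = admin.listTopics().names().get();
            Map<String, TopicDescription> descriptions =
                    admin.describeTopics(topics).all().get();

            int partitions = descriptions.values().stream()
                    .mapToInt(d -> d.partitions().size())
                    .sum();

            System.out.printf("Topics: %d, partitions: %d%n", topics.size(), partitions);
        }
    }
}
```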

answered Sep 18 '22 by Nishu Tayal