We have a 3 host Kafka cluster. We have 136 topics, each of which has 100 partitions, with a replication factor of 3. This makes for 13,600 partitions across our cluster.
Is this a sane configuration of our topics?
It's too many. You should ask yourself whether you have (or plan to have soon) enough consumer instances to need that many partitions. Then, if you really do plan to run ~13k consumer instances, what sort of hardware are you running these brokers on such that they could serve that many consumers? That's even before you consider the additional overhead that large partition counts imposed on brokers before Kafka 1.1: https://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/
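As a quick sanity check, you can pull the real numbers out of the cluster yourself. With 3 brokers and replication factor 3, every broker ends up hosting a replica of all 13,600 partitions. Here is a minimal sketch using the plain Apache Kafka Java AdminClient; the bootstrap address is a placeholder for one of your brokers:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.Set;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.TopicPartitionInfo;

public class PartitionAudit {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder address; point this at one of your own brokers.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            Set<String> topics = admin.listTopics().names().get();

            // allTopicNames() needs kafka-clients 3.1+; on older clients use all() instead.
            Map<String, TopicDescription> descriptions =
                    admin.describeTopics(topics).allTopicNames().get();

            int totalPartitions = 0;
            Map<Integer, Integer> replicasPerBroker = new HashMap<>();

            for (TopicDescription desc : descriptions.values()) {
                totalPartitions += desc.partitions().size();
                for (TopicPartitionInfo p : desc.partitions()) {
                    // Every replica is a log segment set some broker has to host and replicate.
                    p.replicas().forEach(node ->
                            replicasPerBroker.merge(node.id(), 1, Integer::sum));
                }
            }

            System.out.println("Topics: " + topics.size());
            System.out.println("Partitions: " + totalPartitions);
            System.out.println("Replicas per broker: " + replicasPerBroker);
        }
    }
}
```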
To me this looks like 100 was picked because it's a round number that seemed future-proof. I'd suggest starting at a much lower number per topic (say, 2 or 10) and seeing whether you actually hit scale issues that demand more partitions before jumping to expert mode. You can always add more partitions later.
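For example, here is a rough sketch of how that looks with the Java AdminClient: create the topic with a modest partition count up front, and grow it later with createPartitions if your measurements show you need to. The topic name and counts are made up for illustration. One caveat worth knowing: adding partitions changes which partition a given key hashes to, so grow the count before you depend heavily on keyed ordering if you can.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewPartitions;
import org.apache.kafka.clients.admin.NewTopic;

public class GrowTopicLater {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder

        try (AdminClient admin = AdminClient.create(props)) {
            // Start small: 10 partitions, replication factor 3 ("orders" is a made-up topic name).
            admin.createTopics(
                    Collections.singleton(new NewTopic("orders", 10, (short) 3))
            ).all().get();

            // Later, if you actually hit a throughput ceiling, bump the count in place.
            // Partition counts can only ever be increased, never decreased.
            admin.createPartitions(
                    Map.of("orders", NewPartitions.increaseTo(50))
            ).all().get();
        }
    }
}
```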
The short answer to your question is 'it depends'. More partitions in a Kafka cluster lead to higher throughput; however, you need to be aware that the number of partitions also has an impact on availability and latency.
In general, more partitions:
- lead to higher throughput,
- require more open file handles on the brokers,
- may increase unavailability when a broker fails,
- may increase end-to-end latency,
- may require more memory in the client.
You need to study the trade-offs and make sure that you've picked the number of partitions that satisfies your requirements regarding throughput, latency and required resources.
For further details, refer to this blog post from Confluent.
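That post also gives a rough throughput-based rule of thumb for a starting point: if t is the throughput the whole topic must sustain, and p and c are the per-partition throughputs you measured for producing and consuming, you need at least max(t/p, t/c) partitions. A small sketch of that arithmetic (the numbers are made-up measurements, not recommendations):

```java
public class PartitionEstimate {
    public static void main(String[] args) {
        // Made-up measurements in MB/s; replace with numbers from your own benchmarks.
        double targetThroughput = 200.0;    // t: what the whole topic must sustain
        double producerPerPartition = 25.0; // p: measured produce throughput per partition
        double consumerPerPartition = 40.0; // c: measured consume throughput per partition

        // Rule of thumb from the Confluent post: at least max(t/p, t/c) partitions.
        int partitions = (int) Math.ceil(
                Math.max(targetThroughput / producerPerPartition,
                         targetThroughput / consumerPerPartition));

        System.out.println("Suggested minimum partitions: " + partitions); // prints 8
    }
}
```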