
Kafka - is it possible to alter Topic's partition count while keeping the change transparent to Producers and Consumers?

Tags: apache-kafka

I am investigating Kafka to assess its suitability for our use case. Can you please help me understand how flexible Kafka is about changing the number of partitions for an existing topic?

Specifically,

  1. Is it possible to change the number of partitions without tearing down the cluster?
  2. And is it possible to do that without bringing down the topic?
  3. Will adding/removing partitions automatically take care of redistributing messages across the new partitions?

Ideally, I would want the change to be transparent to the producers and consumers. Does Kafka ensure this?

Update: From my understanding so far, it looks like Kafka's design cannot allow this, because the mapping of consumer groups to partitions would have to be altered. Is that correct?

Aadith Ramia asked Jan 04 '19 01:01



1 Answer

  1. Yes, it is perfectly possible. You just execute the following command against the topic of your choice: bin/kafka-topics.sh --zookeeper zk_host:port --alter --topic <your_topic_name> --partitions <new_partition_count>. (On newer Kafka releases, where the --zookeeper option has been removed from this tool, pass --bootstrap-server <broker_host:port> instead.) Remember, Kafka only allows increasing the number of partitions, because decreasing it would cause data loss.

    • There's a catch here. The Kafka documentation says the following:

Be aware that one use case for partitions is to semantically partition data, and adding partitions doesn't change the partitioning of existing data so this may disturb consumers if they rely on that partition. That is if data is partitioned by hash(key) % number_of_partitions then this partitioning will potentially be shuffled by adding partitions but Kafka will not attempt to automatically redistribute data in any way.

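  2. Yes, if by "bringing down the topic" you mean deleting it: the topic does not have to be deleted or recreated.
  3. No, existing messages are not redistributed (see the documentation excerpt above). What Kafka does handle is consumption: once you've increased the partition count, it triggers a rebalance for the consumers subscribed to that topic, and on subsequent polls the partitions, including the new ones, get distributed across the consumers in the group. That part is transparent to the client code; you don't have to worry about it.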
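To make the rebalance step more concrete, here is a minimal sketch of how partitions might be spread over the consumers in a group before and after partitions are added. The function `assign_round_robin` and the consumer names are hypothetical, and the logic is a simplified stand-in, not Kafka's actual assignor implementation.

```python
def assign_round_robin(partitions, consumers):
    """Toy assignment: hand out partitions to consumers in turn.

    Simplified illustration only; Kafka's real assignors are configurable
    and take more state into account.
    """
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

# Hypothetical consumer group with three members.
consumers = ["consumer-a", "consumer-b", "consumer-c"]

before = assign_round_robin(list(range(4)), consumers)  # topic with 4 partitions
after = assign_round_robin(list(range(6)), consumers)   # same topic grown to 6

print("before:", before)
print("after: ", after)
```

In this toy model the two new partitions are simply picked up by existing consumers after the rebalance, which is the sense in which the change is transparent to consumer code.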

NOTE: As I mentioned before, you can only add partitions, removing is not possible.
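The catch quoted from the Kafka documentation above can be illustrated with a small sketch. It assumes keys are placed with hash(key) % number_of_partitions; CRC32 is used here only as a convenient deterministic hash for the demo (Kafka's default partitioner uses a different hash function), and `partition_for` is a hypothetical helper.

```python
import zlib

def partition_for(key: str, num_partitions: int) -> int:
    # Demo stand-in for hash(key) % number_of_partitions.
    # CRC32 is an assumption for this sketch, not Kafka's actual hash.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

keys = [f"user-{i}" for i in range(1000)]

# Where each key lands with 4 partitions, and again after growing to 6.
before = {k: partition_for(k, 4) for k in keys}
after = {k: partition_for(k, 6) for k in keys}

# Keys whose new messages will go to a different partition than their old ones.
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys now map to a different partition")
```

Messages already written stay in their original partition, so for every key in `moved`, older and newer messages end up in different partitions. Consumers that rely on per-key ordering or per-key locality need to account for that.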

Bitswazsky answered Sep 19 '22 23:09