 

Is it possible to add partitions to an existing topic in Kafka 0.8.2

I have a Kafka cluster running with a topic that has 2 partitions, and I was looking for a way to increase the partition count to 3. However, I don't want to lose the existing messages on the topic. I tried stopping Kafka, modifying the server.properties file to increase the number of partitions to 3, and restarting Kafka. However, that does not seem to change anything: using the Kafka ConsumerOffsetChecker, I still see it using only 2 partitions. The Kafka version I am using is 0.8.2.2. In version 0.8.1 there used to be a script called kafka-add-partitions.sh, which I guess might do the trick, but I don't see any such script in 0.8.2.

  • Is there any way of accomplishing this?

I did experiment with creating a whole new topic and for that one, it does seem to use 3 partitions as per the change in the server.properties file. However, for existing topics, it doesn't seem to care.

Asked Nov 12 '15 by Asif Iqbal



2 Answers

Looks like you can use this script instead:

bin/kafka-topics.sh --zookeeper zk_host:port/chroot --alter --topic my_topic_name --partitions 40

In the code it looks like they do the same thing:

 AdminUtils.createOrUpdateTopicPartitionAssignmentPathInZK(topic, partitionReplicaList, zkClient, true) 

kafka-topics.sh executes this piece of code, as does the AddPartitionsCommand used by the kafka-add-partitions.sh script.
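As a quick follow-up check (not part of the original answer, just the same tool's --describe option with the placeholders reused from above), you can confirm the new partition count:

bin/kafka-topics.sh --zookeeper zk_host:port/chroot --describe --topic my_topic_name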

However, you have to be aware of re-partitioning when using keys:

Be aware that one use case for partitions is to semantically partition data, and adding partitions doesn't change the partitioning of existing data so this may disturb consumers if they rely on that partition. That is if data is partitioned by hash(key) % number_of_partitions then this partitioning will potentially be shuffled by adding partitions but Kafka will not attempt to automatically redistribute data in any way.
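To see the effect concretely, here is a purely illustrative shell sketch (cksum is a stand-in for the producer's hash function, not Kafka's actual partitioner, and order-42 is a made-up key): the same key can map to a different partition once the partition count changes.

# hash(key) % number_of_partitions, with cksum as a stand-in hash (NOT Kafka's partitioner)
printf 'order-42' | cksum | cut -d' ' -f1                      # stand-in hash of the key
echo $(( $(printf 'order-42' | cksum | cut -d' ' -f1) % 2 ))   # partition with 2 partitions
echo $(( $(printf 'order-42' | cksum | cut -d' ' -f1) % 3 ))   # partition with 3 partitions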

Answered Sep 22 '22 by blockR


For anyone who wants a solution for newer Kafka versions, please follow this method.

Kafka's entire data retention and transfer policy depends on partitions, so be careful about the effects of increasing them (newer Kafka versions display a warning about this). Try to avoid a configuration in which one broker ends up with too many leader partitions.
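If leadership does end up concentrated on one broker, one option (not from the original answer, and assuming a version that still ships the ZooKeeper-based tool) is to trigger a preferred replica election so that leadership moves back to the first replica listed for each partition:

./bin/kafka-preferred-replica-election.sh --zookeeper localhost:2181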

There is a simple three-step approach to this.

Step 1: Increase the number of partitions in the topic

./bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic testKafka_5 --partitions 6

Step 2: Create a partition reassignment JSON file for the topic

{ "version":1, "partitions":[ {"topic":"testKafka_5","partition":0,"replicas":[0,1,2]}, {"topic":"testKafka_5","partition":1,"replicas":[2,1,0]}, {"topic":"testKafka_5","partition":2,"replicas":[1,2,0]}, {"topic":"testKafka_5","partition":3,"replicas":[0,1,2]}, {"topic":"testKafka_5","partition":4,"replicas":[2,1,0]}, {"topic":"testKafka_5","partition":5,"replicas":[1,2,0]} ]}

Create the file with the new partitions and replicas. It's better to spread replicas across different brokers, but they must be within the same cluster. Take latency into consideration for distant replicas. Transfer the file to your Kafka host.
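If you'd rather not write the assignment by hand, kafka-reassign-partitions.sh can also propose one via its --generate option (a sketch assuming brokers 0, 1 and 2, and a hypothetical topics-to-move.json that lists the topic):

{"topics": [{"topic": "testKafka_5"}], "version": 1}

./bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --topics-to-move-json-file topics-to-move.json --broker-list "0,1,2" --generate

The proposed assignment it prints can then be saved and used as the reassignment JSON file in Step 3.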

Step 3: Reassign partitions and verify

./bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file bin/increase-replication-factor.json --execute

./bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file bin/increase-replication-factor.json --verify

You can check the effects of your change using the --describe command.
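For example, assuming the same local setup as in the steps above:

./bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic testKafka_5

The output lists each partition with its leader and replicas, so both the new partition count and the replica assignment can be verified.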

Answered Sep 25 '22 by c0der512