I'm using kafka 0.8.2 & one of my kafka servers died (no way to recover the data on the disks). There is a topic, with replication of 1, with one of the partitions on the dead server. I thought a reassignment would move the metadata for that partition to a new server without needing the data, but the reassignment is stuck to in progress
.
I ran:
$ /opt/kafka/kafka/bin/kafka-reassign-partitions.sh --zookeeper myzookeeper.my.com --reassignment-json-file new_assignment.json --verify
Status of partition reassignment:
Reassignment of partition [topicX,1] is still in progress
This will never succeed since the dead server is never coming back.
In the new server's log, I saw:
[2015-05-28 06:25:15,401] INFO Completed load of log topicX-1 with log end offset 0 (kafka.log.Log)
[2015-05-28 06:25:15,402] INFO Created log for partition [topicX,1] in /mnt2/data/kafka with properties {segment.index.bytes -> 10485760, file.delete.delay.ms -> 60000, segment.bytes -> 536870912, flush.ms -> 9223372036854775807, delete.retention.ms -> 86400000, index.interval.bytes -> 4096, retention.bytes -> -1, min.insync.replicas -> 1, cleanup.policy -> delete, unclean.leader.election.enable -> true, segment.ms -> 604800000, max.message.bytes -> 1000012, flush.messages -> 9223372036854775807, min.cleanable.dirty.ratio -> 0.5, retention.ms -> 259200000, segment.jitter.ms -> 0}. (kafka.log.LogManager)
[2015-05-28 06:25:15,403] WARN Partition [topicX,1] on broker 4151132: No checkpointed highwatermark is found for partition [topicX,1] (kafka.cluster.Partition)
[2015-05-28 06:25:15,405] INFO [ReplicaFetcherManager on broker 4151132] Removed fetcher for partitions (kafka.server.ReplicaFetcherManager)
[2015-05-28 06:25:15,408] INFO [ReplicaFetcherManager on broker 4151132] Added fetcher for partitions List() (kafka.server.ReplicaFetcherManager)
[2015-05-28 06:25:15,411] INFO [ReplicaFetcherManager on broker 4151132] Removed fetcher for partitions (kafka.server.ReplicaFetcherManager)
[2015-05-28 06:25:15,413] INFO [ReplicaFetcherManager on broker 4151132] Added fetcher for partitions List() (kafka.server.ReplicaFetcherManager)
Is there a way to force it to complete or abort the reassignment action?
Limits on partitions:maximum 4000 partitions per broker (in total; distributed over many topics) maximum 200,000 partitions per Kafka cluster (in total; distributed over many topics)
A Kafka cluster should have a maximum of 200,000 partitions across all brokers when managed by Zookeeper. The reason is that if brokers go down, Zookeeper needs to perform a lot of leader elections. Confluent still recommends up to 4,000 partitions per broker in your cluster.
If you want to change the number of partitions or replicas of your Kafka topic, you can use a streaming transformation to automatically stream all of the messages from the original topic into a new Kafka topic that has the desired number of partitions or replicas.
You can abort the assignment by deleting the "/admin/reassign_partitions" zk node on your zookeeper cluster using zookeeper shell, and move the partitions that are assigned to the dead broker to new nodes.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With