Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What happens if Zookeeper fails completely?

we have setup a Kafka/Zookeeper Cluster consisting of 3 Brokers. We have one producer, sending messages to one specific Kafka topic and a few consumer groups reading from said topic. Those consumers perform a leader election via Zookeeper for themselves (independent from Kafka).

The versions used are:

  • Kafka: 0.9.0.1
  • Zookeeper: 3.4.6 (included in the Kafka-Package)

All processes are managed by Supervisor. So far, everything works just fine. What we tried now (for testing purposes) was to simply kill off all Zookeeper processes and see what happens.

As we expected, our consumer processes couldn't connect to Zookeeper anymore. But unexpectedly, the Kafka Brokers still worked. Our producer didn't complain at all and was still able to write into the topic. While I couldn't use kafka/bin/kafka-topics.sh or similar, since they all require a zookeeper-parameter, I could still see the actual size of the topic-log grow. After restarting the zookeeper processes, everything again worked just like before.

What we couldn't figure out is now... what actually happened there? We thought, Kafka would require a working Zookeeper-Connection and we couldn't find any explanation for this behaviour online.

like image 330
tehK Avatar asked Jun 14 '16 09:06

tehK


People also ask

What type of failure does ZooKeeper assume?

The ZK server is designed to be "fail fast" meaning that it will shutdown (process exit) if an error occurs that it cannot recover from. As a ZooKeeper serving cluster is highly reliable, this means that while the server may go down the cluster as a whole is still active and serving requests.

Is ZooKeeper single point of failure?

In essence, ZooKeeper was a Single Point of Failure (SPoF) in our stack. SPoF terminology usually refers to single machines or servers, but in this case, it was a single distributed service.

Can Kafka work without ZooKeeper?

However, you can install and run Kafka without Zookeeper. In this case, instead of storing all the metadata inside Zookeeper, all the Kafka configuration data will be stored as a separate partition within Kafka itself.

Why do we need zookeepers?

ZooKeeper is an open source Apache project that provides a centralized service for providing configuration information, naming, synchronization and group services over large clusters in distributed systems. The goal is to make these systems easier to manage with improved, more reliable propagation of changes.


1 Answers

When you have one node of zookeeper, broker will not be able to contact zookeeper, after broker discovers zookeeper is not reachable, broker also will become unreachable. Hence the producer and consumer. In case of producer it starts dropping(reject the record). In case of consumer it can happen that, the read record which is not ack'ed may end up processing again when broker is up and ready...

in case of 3node zk one node failure is acceptable as quorum is still satisfied... but cant afford the 2node failures which will lead to the above consequences...

like image 88
Yogesh BG Avatar answered Oct 02 '22 20:10

Yogesh BG