I have read the thread "bootstrap-server vs zookeeper in consumer console", but it did not clear up my doubt.
Say we have ZooKeeper running at localhost:2181 and three brokers running at localhost:9092, localhost:9093, and localhost:9094. We have one topic, my_topic, with 3 partitions and a replication factor of 1, so the topic's partitions are spread across the brokers.
In newer versions of Apache Kafka, when we run the console consumer we need to pass --bootstrap-server localhost:9092, which is the address of one of the brokers, whereas in earlier versions we passed the ZooKeeper address.
So when we run the consumer to read messages from the topic my_topic, we pass the parameter --bootstrap-server localhost:9092, which is nothing but the address of one of the brokers. My question is: are we restricting the consumer to reading messages only from that broker? And if so, how will the consumer read messages from the topic if that broker itself goes down? I don't understand how this works. Could someone please clarify?
Command for running the consumer in older versions (< 1.0):
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic my_topic
Command for running the consumer in newer versions (>= 1.0):
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic my_topic
First of all, ZooKeeper is needed only by the high-level consumer. SimpleConsumer does not require ZooKeeper to work. The main reason the high-level consumer needs ZooKeeper is to track consumed offsets and handle load balancing. Now in more detail.
Confluent views ZooKeeper's deprecation as an important move for the Kafka community, said Jun Rao, Kafka's co-creator and co-founder of Confluent. "It makes deployment/operation much simpler and improves the scalability by a factor of 10 because of more efficient handling of metadata."
At a detailed level, ZooKeeper handles the leadership election of Kafka brokers and manages service discovery as well as cluster topology so each broker knows when brokers have entered or exited the cluster, when a broker dies and who the preferred leader node is for a given topic/partition pair.
Yes, ZooKeeper is required by Kafka by design, because ZooKeeper is responsible for managing the Kafka cluster. It holds the list of all Kafka brokers, and it notifies Kafka when a broker or partition goes down, or when a new broker or partition comes up.
In previous Kafka versions (before 0.9.0), the consumer needed a connection to ZooKeeper both for committing offsets and for getting topic metadata. Starting from 0.9.0, the consumer offsets are saved in a Kafka topic (__consumer_offsets) and the connection to ZooKeeper is no longer needed.
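To make the idea of offsets living in a Kafka-side log more concrete, here is a minimal sketch in Python. It is a toy model, not broker code: the list `offsets_log` stands in for the internal offsets topic, and the group name `my_group` and the helper functions `commit` and `committed` are made up for illustration. It only shows the general idea that commits are appended as keyed records and the latest record for a key wins.

```python
# Toy model of offsets kept in a Kafka-side log; NOT the real broker code.
# The group name, topic name, and helper functions are hypothetical.

offsets_log = []  # append-only list standing in for the internal offsets topic


def commit(group, topic, partition, offset):
    """Append a commit record keyed by (group, topic, partition)."""
    offsets_log.append(((group, topic, partition), offset))


def committed(group, topic, partition):
    """Return the latest committed offset for the key, or None if absent."""
    latest = None
    for key, offset in offsets_log:
        if key == (group, topic, partition):
            latest = offset  # later records override earlier ones
    return latest


commit("my_group", "my_topic", 0, 5)
commit("my_group", "my_topic", 0, 9)
print(committed("my_group", "my_topic", 0))  # prints 9
print(committed("my_group", "my_topic", 1))  # prints None
```

Because the position is stored on the Kafka side, a consumer that restarts can look up where its group left off without talking to ZooKeeper.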
What you specify in the --bootstrap-server parameter is exactly what the name says: a list of bootstrap servers. The consumer connects to the brokers you specify and asks for metadata about the topics it wants to consume. It is not limited to consuming messages only from the brokers listed in the --bootstrap-server parameter.
Let's say you specify "kafka1:9092" as the bootstrap server (in a cluster with 3 brokers, as you said). After connecting, the consumer sends a metadata request to get information about "my_topic". The "kafka1" server could reply: "I am not the leader for partition 0 of my_topic; the leader for that partition is kafka2." At this point, the consumer connects to the "kafka2" broker to start fetching messages.
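The flow described above can be sketched as a small simulation in Python. This is a toy model and not the real Kafka client: the class `ToyCluster`, the function `discover_leaders`, and the partition-to-leader layout are all assumptions made up for illustration, using the three broker addresses from the question.

```python
# Toy model of bootstrap discovery; this is NOT the real Kafka client.
# ToyCluster, discover_leaders, and the partition layout are hypothetical.

class BrokerDown(Exception):
    """Raised when a simulated broker cannot be reached."""


class ToyCluster:
    def __init__(self, brokers, leaders):
        self.brokers = set(brokers)   # every broker address in the cluster
        self.leaders = dict(leaders)  # partition number -> leader address
        self.down = set()             # brokers that are currently unreachable

    def metadata(self, address):
        """Any live broker can describe the whole cluster."""
        if address in self.down or address not in self.brokers:
            raise BrokerDown(address)
        return dict(self.leaders)

    def fetch(self, address, partition):
        """Only the leader of a partition serves its messages."""
        if address in self.down:
            raise BrokerDown(address)
        if self.leaders[partition] != address:
            raise ValueError(f"{address} is not the leader of partition {partition}")
        return f"messages of partition {partition} from {address}"


def discover_leaders(cluster, bootstrap_servers):
    """Ask each bootstrap address in turn until one answers."""
    for address in bootstrap_servers:
        try:
            return cluster.metadata(address)
        except BrokerDown:
            continue  # try the next address in the list
    raise BrokerDown("no bootstrap server reachable")


cluster = ToyCluster(
    brokers=["localhost:9092", "localhost:9093", "localhost:9094"],
    leaders={0: "localhost:9092", 1: "localhost:9093", 2: "localhost:9094"},
)

# One bootstrap address is enough to learn about all three partition leaders.
leaders = discover_leaders(cluster, ["localhost:9092"])
for partition, leader in sorted(leaders.items()):
    print(cluster.fetch(leader, partition))
```

In this sketch, if localhost:9092 is the only bootstrap address and it is down when the consumer starts, discovery fails, which is why it is common to pass several addresses as a comma-separated list, for example --bootstrap-server localhost:9092,localhost:9093. Note also that with a replication factor of 1, the partition whose only replica sits on the failed broker stays unavailable until that broker comes back, no matter which bootstrap address was used.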