
Why does the Kafka producer take a broker endpoint when being initialized instead of the ZooKeeper endpoint?

Tags:

apache-kafka

If I have multiple brokers, which broker should my producer use? Do I need to manually switch the broker to balance the load? Also why does the consumer only need a zookeeper endpoint instead of a broker endpoint?

quick example from tutorial:

> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
> bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning


asked Mar 16 '14 23:03

Erben Mo

2 Answers

which broker should my producer use?
Do I need to manually switch the broker to balance the load?

Kafka runs as a cluster, that is, a set of nodes (brokers), so when producing you give the producer the list of brokers that you've configured for your application. Below is a short note taken from the Kafka documentation.

“metadata.broker.list” defines where the Producer can find a one or more Brokers to determine the Leader for each topic. This does not need to be the full set of Brokers in your cluster but should include at least two in case the first Broker is not available. No need to worry about figuring out which Broker is the leader for the topic (and partition), the Producer knows how to connect to the Broker and ask for the meta data then connect to the correct Broker.
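To make the failover point concrete, here is a minimal Python sketch. It is not Kafka client code: it only parses a comma-separated broker list in the same host:port form and returns the first broker that answers. The host names, the is_reachable callback, and both function names are hypothetical, chosen purely for illustration.

```python
def parse_broker_list(broker_list):
    """Split a 'host1:port1,host2:port2' string into (host, port) tuples."""
    brokers = []
    for entry in broker_list.split(","):
        host, _, port = entry.strip().rpartition(":")
        brokers.append((host, int(port)))
    return brokers


def first_available(brokers, is_reachable):
    """Return the first broker that is_reachable() accepts, trying in order."""
    for broker in brokers:
        if is_reachable(broker):
            return broker
    raise ConnectionError("no broker in the list is reachable")


# Hypothetical hosts; pretend the first one is down.
brokers = parse_broker_list("broker1:9092, broker2:9092")
down = {("broker1", 9092)}
contact = first_available(brokers, lambda b: b not in down)
print(contact)
```

With only one broker in the list, the same call would raise instead of falling through to a second entry, which is why the documentation suggests listing at least two.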

Hope this clears up some of your confusion.

Also why does the consumer only need a zookeeper endpoint instead of a broker endpoint?

This is not technically correct, as there are two types of consumer API available: the high-level consumer and the low-level (Simple) consumer.

The high-level consumer takes care of most things for you, such as leader detection and threading issues, but it does not give you much control over messages. That control is exactly the purpose of the alternative, the Simple (low-level) consumer, where you need to provide the brokers and partition-related details yourself.

So the consumer needs only a ZooKeeper endpoint when you use the high-level API; with the Simple consumer you do need to provide the other information.


answered Sep 23 '22 00:09

user2720864

Kafka sets a single broker as the leader for each partition of each topic. The leader is responsible for handling both reads and writes to that partition. You cannot decide to read or write from a non-Leader broker.

So, what does it mean to provide a broker or list of brokers to the kafka-console-producer? Well, the broker or brokers you provide on the command line are just the first contact point for your producer. If the broker you list is not the leader for the topic/partition you need, your producer will get the current leader info (called "topic metadata" in Kafka-speak) and reconnect to other brokers as necessary before sending writes. In fact, if your topic has multiple partitions, it may even connect to several brokers in parallel (if the partition leaders are different brokers).
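A toy model of that lookup, in Python, may help. This is a sketch of the idea only, not how any real client is written: the metadata table, the broker names, and the fetch_metadata and route functions are all made up for illustration.

```python
# Toy "topic metadata": partition number -> leader broker (hypothetical hosts).
TOPIC_METADATA = {
    "test": {0: "broker1:9092", 1: "broker2:9092", 2: "broker1:9092"},
}


def fetch_metadata(bootstrap_broker, topic):
    """Stand-in for the metadata request; any broker can answer it."""
    return TOPIC_METADATA[topic]


def route(bootstrap_broker, topic, messages):
    """Group (partition, message) pairs by the broker leading each partition."""
    leaders = fetch_metadata(bootstrap_broker, topic)
    per_broker = {}
    for partition, message in messages:
        per_broker.setdefault(leaders[partition], []).append(message)
    return per_broker


batches = route("broker1:9092", "test", [(0, "a"), (1, "b"), (2, "c")])
print(batches)
```

The bootstrap broker is only asked for metadata; the writes themselves are grouped by partition leader, so two brokers end up receiving data even though only one was named up front.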

Second question: why does the consumer require a ZooKeeper list for connections instead of a broker list? The answer is that Kafka consumers can operate in "groups", and ZooKeeper is used to coordinate those groups (how groups work is a larger issue, beyond the scope of this question). ZooKeeper also stores broker lists for topics, so the consumer can pull broker lists directly from ZooKeeper, making an additional --broker-list a bit redundant.
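As a rough illustration of pulling broker details out of ZooKeeper, the Python sketch below uses a plain dict in place of a ZooKeeper connection. The znode paths are modelled on the 0.8-era layout as I recall it, so treat them as an assumption; the payloads are trimmed to the fields used here, and the hosts and the leader_endpoint function are hypothetical.

```python
import json

# Dict standing in for ZooKeeper znodes (path -> JSON payload).
# Paths and payloads are assumptions for illustration, not live data.
ZNODES = {
    "/brokers/ids/0": '{"host": "broker1", "port": 9092}',
    "/brokers/ids/1": '{"host": "broker2", "port": 9092}',
    "/brokers/topics/test/partitions/0/state": '{"leader": 1, "isr": [1, 0]}',
}


def leader_endpoint(znodes, topic, partition):
    """Resolve host:port of the broker leading the given topic partition."""
    state_path = f"/brokers/topics/{topic}/partitions/{partition}/state"
    state = json.loads(znodes[state_path])
    broker = json.loads(znodes[f"/brokers/ids/{state['leader']}"])
    return f"{broker['host']}:{broker['port']}"


print(leader_endpoint(ZNODES, "test", 0))
```

Given only the ZooKeeper data, the consumer side can work out which broker to talk to, which is the sense in which a separate broker list is redundant.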

answered Sep 23 '22 00:09

dpkp

