Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Clarification on the --broker-list option and --bootstrap-server option in Kafka-console producer and consumer respectively

Tags:

apache-kafka

In Kafka-console-producer, the --broker-list takes a list of servers. Does the producer connect to all of them? (or) Does the producer uses the list of servers to connect one of them and if that one fails, switches to the next and so on?

Similarly, in Kafka-console-consumer the --bootstrap-server takes a list of Kafka servers. If there are two Kafka servers, do I need to specify both of them in the --bootstrap-server? I tried myself running the consumer with one server (Kafka-server1) and when I stopped Kafka-server1, it continued to receive data for the topic.

like image 870
humility Avatar asked Dec 08 '25 06:12

humility


1 Answers

They both act/are the same.

If you look at the Kafka source code, you'll see both options lead to the same "bootstrap.servers" configuration property

  def producerProps(config: ProducerConfig): Properties = {
    val props =
      if (config.options.has(config.producerConfigOpt))
        Utils.loadProps(config.options.valueOf(config.producerConfigOpt))
      else new Properties

    props ++= config.extraProducerProps

    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, config.brokerList) // <---- brokerList is passed as BOOTSTRAP_SERVER

Both consumer and producer will connect in a round-robin fashion to the list of provided addresses to create an initial "boostrap" connection to the Kafka Controller, which knows about all available brokers in the cluster at a given time. It is good practice to give at least 3 for high-availability.


If there are two Kafka servers, do I need to specify both of them in the --bootstrap-server?

With regards to having multiple addresses availble to use, in a cloud enviornment, where you might have brokers over availability zones, it is recommended to have at least 2 brokers listed per availability zone, so 6 total for 3 zones.

The address provided for clients could be similified using a load balancer / reverse proxy down to a single kafka.your.network:9092 address, but then you are introducing extra DNS and network hops to figure out the connection for the sake of having a single, well-known, address.

In any case, all available addresses for the brokers will be handed to the clients and then cached locally.

However, it is important to recognize all send/poll requests will only communicate with the singular leader of a TopicPartition, despite how many addresses you give and how many replicas a topic will have.

like image 197
OneCricketeer Avatar answered Dec 11 '25 04:12

OneCricketeer



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!