I'm getting started with Confluent Platform which requires to run Zookeeper (zookeeper-server-start /etc/kafka/zookeeper.properties
) and then Kafka (kafka-server-start /etc/kafka/server.properties
). I am writing an Upstart script that should run both Kafka and Zookeeper. The issue is that Kafka should block until Zookeeper is ready (because it depends on it) but I can't find a reliable way to know when Zookeeper is ready. Here are some attempts in pseudo-code after running the Zookeeper server start:
Use a hardcoded block
sleep 5
Does not work reliably on slower computers and/or waits longer than needed.
Check when something (hopefully Zookeeper) is running on port 2181
wait until $(echo stat | nc localhost ${port}) is not none
This did not seem to work as it doesn't wait long enough for Zookeeper to accept a Kafka connection.
Check the logs
wait until specific string in zookeeper log is found
This is sketchy and there isn't even a string that cannot also be found on error too (e.g. "binding to port [...]").
Is there a reliable way to know when Zookeeper is ready to accept a Kafka connection? Otherwise, I will have to resort to a combination of 1 and 2.
At a detailed level, ZooKeeper handles the leadership election of Kafka brokers and manages service discovery as well as cluster topology so each broker knows when brokers have entered or exited the cluster, when a broker dies and who the preferred leader node is for a given topic/partition pair.
I found that using a timer is not reliable. the second option (waiting for the port) worked for me:
bin/zookeeper-server-start.sh -daemon config/zookeeper.properties && \
while ! nc -z localhost 2181; do sleep 0.1; done && \
bin/kafka-server-start.sh -daemon config/server.properties
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With