I need to automate a rolling restart of a Kafka cluster (3 Kafka brokers). I can easily do it manually: restart one broker after the other, while checking the log to see when it's fine (e.g., when the new process has joined the cluster).
What is a good way to automate this check? How can I ask the broker whether it's up and running, connected to its peers, all topics up-to-date and such? In my restart script, I have access to the metrics, but to be frank, I did not really see one there which gives me a clear picture.
Another way to put it would be to ask what a good "readiness" probe would be, one that does not simply check some TCP/IP port, but looks at the actual server...
Restart Kafka: CFK (Confluent for Kubernetes) automatically restarts Kafka clusters when required. It restarts one broker at a time, from the highest-numbered broker down to 0 (i.e., broker-n to broker-0), and checks that there are no under-replicated partitions on the broker before proceeding to the next broker.
A rolling restart means that only one Kafka broker is restarted at a time. The rolling restart doesn't proceed to the next broker until the previous one has started again and is in sync with the cluster. This keeps your cluster online the whole time, with no messages lost.
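The loop described above can be sketched roughly as follows. This is only a minimal illustration, not how CFK itself is implemented: `restart` and `is_in_sync` are hypothetical callables that you would supply (for example, an SSH or systemd call and a check for zero under-replicated partitions), and the timeout values are arbitrary assumptions.

```python
import time
from typing import Callable, Iterable


def rolling_restart(
    broker_ids: Iterable[int],
    restart: Callable[[int], None],      # hypothetical: restarts one broker
    is_in_sync: Callable[[int], bool],   # hypothetical: True once the broker is healthy
    timeout_s: float = 600.0,            # assumed limit, tune for your cluster
    poll_interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Restart brokers one at a time, highest ID first."""
    for broker_id in sorted(broker_ids, reverse=True):
        restart(broker_id)
        waited = 0.0
        # Do not touch the next broker until this one reports healthy.
        while not is_in_sync(broker_id):
            if waited >= timeout_s:
                raise TimeoutError(
                    f"broker {broker_id} not in sync after {timeout_s}s"
                )
            sleep(poll_interval_s)
            waited += poll_interval_s
```

Raising on timeout stops the rollout with at most one broker down, which is usually safer than continuing with a degraded cluster.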
There are 2 ways to get the list of available brokers in a Kafka cluster, both using scripts that talk to ZooKeeper. ZooKeeper manages leader election and other coordination tasks for a Kafka cluster, so it holds a list of all the Kafka brokers in the cluster.
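As a rough sketch of one such approach, the snippet below shells out to `zookeeper-shell.sh` (shipped with Kafka) and lists the `/brokers/ids` znode, where live brokers register themselves. The connect string `localhost:2181`, the script being on `PATH`, and the exact shape of the shell output (a final line like `[0, 1, 2]`) are assumptions; this only applies to ZooKeeper-based clusters.

```python
import re
import subprocess
from typing import List


def parse_broker_ids(shell_output: str) -> List[int]:
    """Extract broker IDs from the last '[...]' line of the shell output."""
    for line in reversed(shell_output.splitlines()):
        match = re.fullmatch(r"\[(.*)\]", line.strip())
        if match:
            return [int(tok) for tok in match.group(1).split(",") if tok.strip()]
    return []


def list_broker_ids(zk_connect: str = "localhost:2181",
                    shell: str = "zookeeper-shell.sh") -> List[int]:
    """Ask ZooKeeper which broker IDs are currently registered."""
    result = subprocess.run(
        [shell, zk_connect, "ls", "/brokers/ids"],
        capture_output=True, text=True, check=True,
    )
    return parse_broker_ids(result.stdout)
```

A restarted broker showing up in this list only means it has registered with ZooKeeper. It does not mean its replicas have caught up, so this check is best combined with a replication metric.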
If a broker dies, Kafka redistributes leadership of its topic partitions among the remaining brokers in the cluster.
I would suggest exposing JMX metrics and tracking the following for cluster health:
server.properties
make sure there are none in the metric counts)
Also, Yelp has tooling for rolling restarts, implemented in Python. It requires Jolokia JMX agents installed on the brokers, and it polls the metrics to make sure some of the above conditions are true.
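A readiness check in the spirit of that metrics polling might look like the sketch below. It is not Yelp's implementation: it assumes a Jolokia agent reachable over HTTP at a base URL such as `http://broker-0:8778/jolokia` (host and port are placeholders), and it reads only the `UnderReplicatedPartitions` gauge of the `ReplicaManager` MBean. The `fetch` parameter is a hypothetical hook so the HTTP call can be swapped out.

```python
import json
import urllib.request
from typing import Callable

URP_MBEAN = "kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions"


def http_get(url: str) -> str:
    """Fetch a URL and return the body as text."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def under_replicated_partitions(
    base_url: str, fetch: Callable[[str], str] = http_get
) -> int:
    """Read the broker's under-replicated partition count via Jolokia."""
    payload = json.loads(fetch(f"{base_url}/read/{URP_MBEAN}/Value"))
    if payload.get("status") != 200:
        raise RuntimeError(f"Jolokia error: {payload}")
    return int(payload["value"])


def broker_is_ready(base_url: str, fetch: Callable[[str], str] = http_get) -> bool:
    """True only if the broker answers and reports zero under-replicated partitions."""
    try:
        return under_replicated_partitions(base_url, fetch) == 0
    except (OSError, ValueError, KeyError, RuntimeError):
        # Unreachable agent or malformed reply: treat the broker as not ready.
        return False
```

Treating any error as "not ready" means the restart script keeps waiting while the broker's JVM is still coming up, instead of failing on the first refused connection.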