After deploying some Apache Kafka instances on remote nodes, I observed a problem with the kafka-server-stop.sh
script that is part of the Kafka archive.
By default it contains:
#!/bin/sh
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
ps ax | grep -i 'kafka\.Kafka' | grep java | grep -v grep | awk '{print $1}' | xargs kill -SIGTERM
This script works fine if I run Apache Kafka as a foreground process, for example:
/var/lib/kafka/bin/kafka-server-start.sh /var/lib/kafka/config/server.properties
It also works when I run it as a background process:
/var/lib/kafka/bin/kafka-server-start.sh /var/lib/kafka/config/server.properties &
But on my remote nodes I start it (using Ansible) with this Python script:
#!/usr/bin/env python
import argparse
import os
import subprocess

KAFKA_PATH = "/var/lib/kafka/"


def execute_command_pipe_output(command_to_call):
    return subprocess.Popen(command_to_call, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)


def execute_command_no_output(command_to_call):
    with open(os.devnull, "w") as null_file:
        return subprocess.Popen(command_to_call, stdout=null_file, stderr=subprocess.STDOUT)


def start_kafka(args):
    command_to_call = ["nohup"]
    command_to_call += [KAFKA_PATH + "bin/zookeeper-server-start.sh"]
    command_to_call += [KAFKA_PATH + "config/zookeeper.properties"]
    proc = execute_command_no_output(command_to_call)

    command_to_call = ["nohup"]
    command_to_call += [KAFKA_PATH + "bin/kafka-server-start.sh"]
    command_to_call += [KAFKA_PATH + "config/server.properties"]
    proc = execute_command_no_output(command_to_call)


def stop_kafka(args):
    command_to_call = [KAFKA_PATH + "bin/kafka-server-stop.sh"]
    proc = execute_command_pipe_output(command_to_call)
    for line in iter(proc.stdout.readline, b''):
        print line,

    command_to_call = [KAFKA_PATH + "bin/zookeeper-server-stop.sh"]
    proc = execute_command_pipe_output(command_to_call)
    for line in iter(proc.stdout.readline, b''):
        print line,


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Starting Zookeeper and Kafka instances")
    parser.add_argument('action', choices=['start', 'stop'], help="action to take")
    args = parser.parse_args()
    if args.action == 'start':
        start_kafka(args)
    elif args.action == 'stop':
        stop_kafka(args)
    else:
        parser.print_help()
After executing
manage-kafka.py start
manage-kafka.py stop
ZooKeeper is shut down (as it should be), but Kafka is still running.
What is more interesting, when I invoke (by hand)
nohup /var/lib/kafka/bin/kafka-server-stop.sh
or
nohup /var/lib/kafka/bin/kafka-server-stop.sh &
kafka-server-stop.sh
properly shuts down the Kafka instance. I suspect this problem may be caused by some Linux/Python thing.
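One way to narrow this down is to check what the stop script's pipeline actually matches at the moment it runs. The sketch below mimics the `ps ax | grep -i 'kafka\.Kafka' | grep java | grep -v grep | awk '{print $1}'` filter in Python. The function names `match_kafka_pids` and `current_kafka_pids` are made up for illustration, and the code is a debugging aid under those assumptions, not part of Kafka.

```python
import re
import subprocess


def match_kafka_pids(ps_output):
    # Mimics the filter chain used by kafka-server-stop.sh:
    # case-insensitive match on "kafka.Kafka", must contain "java",
    # must not contain "grep", then take the first column (the PID).
    pids = []
    for line in ps_output.splitlines():
        if not re.search(r"kafka\.Kafka", line, re.IGNORECASE):
            continue
        if "java" not in line or "grep" in line:
            continue
        fields = line.split()
        if fields:
            pids.append(fields[0])
    return pids


def current_kafka_pids():
    # Runs the same "ps ax" the stop script uses and filters its output.
    ps_output = subprocess.check_output(["ps", "ax"], universal_newlines=True)
    return match_kafka_pids(ps_output)
```

Calling `current_kafka_pids()` right after `manage-kafka.py stop` shows whether the broker is still listed. If it is, the script did find the process and send SIGTERM, and the broker is simply still shutting down or stuck.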
Go to the Kafka home directory and execute the command ./bin/kafka-server-start.sh config/server.properties. Stop the Kafka broker with the command ./bin/kafka-server-stop.sh.
ZooKeeper and Kafka are written in Java, so you'll need a JDK.
Use 'systemctl status kafka' to check the status.
Kafka brokers need to finish their shutdown process before the ZooKeeper nodes shut down.
If ZooKeeper has already been stopped, start it again; the Kafka brokers will then retry the shutdown process.
I had a similar case. The problem was that my config was not waiting for the Kafka brokers to shut down. Hope this helps somebody. It took me a while to figure out...
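Applied to the question's script, this means waiting for the broker process to exit before calling zookeeper-server-stop.sh. Here is a minimal sketch of that idea, assuming `pgrep` is available on the node and reusing the `/var/lib/kafka/` path from the question. The helper names (`kafka_is_running`, `wait_until_stopped`, `stop_kafka_then_zookeeper`) and the 60-second timeout are illustrative choices, not anything Kafka prescribes.

```python
import os
import subprocess
import time

KAFKA_PATH = "/var/lib/kafka/"  # install path taken from the question


def kafka_is_running():
    # Assumption: pgrep exists on the node. With -f it matches against the
    # full command line and exits with status 0 when a process matches.
    with open(os.devnull, "w") as null_file:
        return subprocess.call(["pgrep", "-f", r"kafka\.Kafka"], stdout=null_file) == 0


def wait_until_stopped(is_running, timeout=60.0, poll_interval=1.0):
    # Polls is_running() until it returns False.
    # Returns True if the process stopped before the timeout, else False.
    deadline = time.time() + timeout
    while is_running():
        if time.time() >= deadline:
            return False
        time.sleep(poll_interval)
    return True


def stop_kafka_then_zookeeper():
    # kafka-server-stop.sh only sends SIGTERM and returns immediately,
    # so wait for the broker to exit before touching ZooKeeper.
    subprocess.call([KAFKA_PATH + "bin/kafka-server-stop.sh"])
    if not wait_until_stopped(kafka_is_running):
        raise RuntimeError("Kafka broker still running; leaving ZooKeeper up")
    subprocess.call([KAFKA_PATH + "bin/zookeeper-server-stop.sh"])
```

Passing the check as a callable keeps `wait_until_stopped` independent of how the broker is detected, so a PID file or a systemd query could be swapped in instead of `pgrep`.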
I faced this issue a lot before figuring out a brute-force way to solve it. What happens is that Kafka shut down abruptly but the port is still in use.
Follow these steps:
lsof -t -i :YOUR_PORT_NUMBER   ## this is for mac
kill -9 process_id
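The same two steps can be sketched in Python, assuming `lsof` is installed. The function names are hypothetical, and the port is whatever your broker listens on (9092 is Kafka's default). Note that `lsof -i :PORT` also lists clients connected to that port, not only the listener, so check the PIDs before signalling them.

```python
import os
import signal
import subprocess


def parse_pids(lsof_output):
    # "lsof -t" prints one PID per line and nothing else.
    return [int(token) for token in lsof_output.split() if token.isdigit()]


def pids_on_port(port):
    # lsof exits with a non-zero status when no process matches,
    # which check_output reports as CalledProcessError.
    try:
        output = subprocess.check_output(
            ["lsof", "-t", "-i", ":%d" % port], universal_newlines=True
        )
    except subprocess.CalledProcessError:
        return []
    return parse_pids(output)


def kill_port_holders(port, sig=signal.SIGTERM):
    # Defaults to SIGTERM so the broker can shut down cleanly;
    # pass signal.SIGKILL for the equivalent of "kill -9".
    for pid in pids_on_port(port):
        os.kill(pid, sig)
```

Trying SIGTERM first and falling back to SIGKILL only if the process does not exit is usually gentler on the broker's logs than going straight to `kill -9`.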