How to enable GC logging for Apache Kafka brokers, while preventing log file overwrites and capping disk space usage

We recently decided to enable GC logging for Apache Kafka brokers on a number of clusters (exact version varies) as an aid to investigating Kafka-related memory and garbage collection problems. We want to do that for running brokers only (not for Kafka operations such as "kafka-topics.sh"). We also want to avoid two problems we know might happen:

  • overwriting of the log file when a broker restarts for any reason
  • the logs using too much disk space, eventually filling the disk (run a cluster long enough and the GC log files will fill the disk unless they are managed)

When Java GC logging starts for a process, it seems to replace the content of any existing file of the same name. This means that unless you are careful, you will lose your GC logs, perhaps exactly when you are most likely to need them (for example, after a crash forced a broker restart).

Setting the environment variable GC_LOG_ENABLED to "true" before running kafka-server-start.sh enables GC logging, but doesn't address the two problems above. It adds this fixed set of parameters: -Xloggc:<gc-log-file-loc> -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps, where <gc-log-file-loc> is in the same directory and has the same name as the .out file, but with "-gc.log" in place of ".out".
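
For example (a minimal sketch; the log location is then fixed by the scripts, so the two problems above remain):

GC_LOG_ENABLED=true ./kafka-server-start.sh server.properties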

Asked Dec 18 '22 by Jim Hoagland

1 Answer

You can set KAFKA_GC_LOG_OPTS to the specific JVM parameters described below before running kafka-server-start.sh. This works because kafka-run-class.sh includes the contents of that environment variable in the JVM options, but only if it is passed -loggc on its command line; kafka-server-start.sh does pass it.
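
A minimal sketch of that gating mechanism, paraphrased from the behavior described above rather than quoted from kafka-run-class.sh (JVM_OPTS here is a stand-in variable name):

# paraphrased sketch, not the script's exact text
case "$1" in
  -loggc)
    GC_LOG_ENABLED="true"   # kafka-server-start.sh passes -loggc
    shift
    ;;
esac
if [ "x$GC_LOG_ENABLED" = "xtrue" ]; then
  # the caller's KAFKA_GC_LOG_OPTS ends up on the java command line
  JVM_OPTS="$JVM_OPTS $KAFKA_GC_LOG_OPTS"
fi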

If you are starting Kafka through Apache Ambari, you would set KAFKA_GC_LOG_OPTS in Kafka service > Configs > Advanced kafka-env > "kafka-env template". Set there, it appears to be used only by kafka-server-start.sh; the other scripts do not currently pass -loggc to kafka-run-class.sh.

Now let's discuss the JVM parameters to include in KAFKA_GC_LOG_OPTS.

To enable GC logging to a file, you will need to add -verbose:gc -Xloggc:<log-file-location>.

You need to give the log file name special consideration to prevent overwrites whenever the broker is restarted. Each invocation seems to need a unique name, so appending a timestamp looks like the best option. You can include something like `date +'%Y%m%d%H%M'` to add a timestamp in the form YYYYMMDDHHMM. In some versions of Java you can instead put "%t" in your log file location and it will be replaced by the broker startup timestamp, formatted as YYYY-MM-DD_HH-MM-SS.
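
For example, the two approaches to a unique name look like this (the paths are hypothetical):

# shell-generated timestamp, unique per invocation (YYYYMMDDHHMM)
-Xloggc:/var/log/kafka/kafka-broker-gc.log-`date +'%Y%m%d%H%M'`
# or, on JVMs that support %t, let the JVM insert the startup timestamp
-Xloggc:/var/log/kafka/kafka-broker-gc-%t.log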

Now on to managing disk space usage. I'll be happy if there is a simpler way than what I have.

First, take advantage of Java's built-in GC log file rotation. -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M is an example of enabling this rotation, keeping up to 10 GC log files from the JVM, each no more than 100MB in size. That is 10 x 100MB = 1000MB of maximum usage.

With GC log file rotation in place with up to 10 files, '.0', '.1', ... '.9' will be appended to the file name you gave in -Xloggc. The JVM starts with .0 and, after filling .9, wraps around to overwrite .0 and continues in a round-robin manner. In some versions of Java, '.current' is additionally appended to the name of the log file currently being written.
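
With the hypothetical timestamped name from above, a directory listing might look roughly like this:

kafka-broker-gc.log-202301151030.0
kafka-broker-gc.log-202301151030.1
...
kafka-broker-gc.log-202301151030.9
kafka-broker-gc.log-202301151030.3.current    (on Java versions that mark the active file)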

Due to the unique file naming we apparently need to avoid overwrites, you can still accumulate up to 1000MB per broker process invocation, so rotation alone is not a total solution to managing disk space used by Kafka broker GC logs. You will end up with a set of up to 10 GC log files for each invocation of each broker, and these can add up over time. The best solution (under *nix) would seem to be to use the logrotate utility (or some other utility) to periodically clean up broker GC logs that have not been modified in the last N days; see the sketch below.
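
For example, a daily cron job along these lines would do it (the path and the 7-day age are hypothetical; adjust to your setup):

# remove broker GC logs untouched for more than 7 days
find /var/log/kafka -name 'kafka-broker-gc.log-*' -mtime +7 -delete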

Be sure to do the math and make sure you will have enough disk space.

People frequently want more details and context in their GC logs than the default, so consider adding -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps, as is done with GC_LOG_ENABLED=true.
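
With those flags, each GC event line will look roughly like this (an illustrative line from a Parallel GC JVM; the exact format varies by collector and Java version):

2023-01-15T10:30:00.123+0000: 12.345: [GC (Allocation Failure) [PSYoungGen: 262144K->10752K(305664K)] 262144K->10792K(1004032K), 0.0123456 secs]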

Putting all the parameters together into KAFKA_GC_LOG_OPTS and starting a broker, you might have:
TIMESTAMP=`date +'%Y%m%d%H%M'`
# GC log location/name prior to .n addition by log rotation
GC_LOG_NAME="{{kafka_log_dir}}/kafka-broker-gc.log-$TIMESTAMP"

GC_LOG_ENABLE_OPTS="-verbose:gc -Xloggc:$GC_LOG_NAME"
GC_LOG_ROTATION_OPTS="-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M"
GC_LOG_FORMAT_OPTS="-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps"

export KAFKA_GC_LOG_OPTS="$GC_LOG_ENABLE_OPTS $GC_LOG_ROTATION_OPTS $GC_LOG_FORMAT_OPTS"
./kafka-server-start.sh server.properties

From the command line, replace {{kafka_log_dir}} with the location of your Kafka log directory, or wherever you want the GC logs to go. You can change the log file naming too.

Under Ambari, you would add those lines (but not the kafka-server-start.sh invocation) to the "kafka-env template" field in the Ambari UI. There, {{kafka_log_dir}} will be automatically replaced with the Kafka log directory, which is defined shortly above that field. You'll need to restart the brokers for GC logging to start (consider doing a rolling restart).

Answered Jan 19 '23 by Jim Hoagland