Java GC log is full of weird characters

I have an issue on a few servers with the GC log. It's filled with this:

^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@

I noticed this occurs on servers where the JVM is given a large amount of memory: -Xms32G -Xmx48G. This might be a red herring, but I thought I'd mention it.

Since these are low-latency/high-throughput apps, analyzing the log is vital. But instead, it's full of the characters above.

We are using Java 8:

java version "1.8.0_40"
Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)

We use this to create the log:

-verbose:gc
-Xloggc:/path/to/gc.log
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps

Anyone seen this issue before? What might cause it?

asked Oct 07 '15 by bdem

1 Answer

TL;DR

Don't use logrotate (or any 3rd-party rotation) to rotate JVM GC logs. Its behaviour does not play nicely with the way the JVM writes the GC log file. The JVM is capable of rotating its own GC logs, using JVM flags:

  • -XX:+UseGCLogFileRotation Enables GC log file rotation
  • -XX:NumberOfGCLogFiles=5 Will tell the JVM to keep 5 rotated files
  • -XX:GCLogFileSize=20M Will tell the JVM to rotate when the file reaches 20M
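Putting the flags together, a launch command using the JVM's built-in rotation might look like the following sketch (the heap sizes, log path, and application jar are placeholders, not from the original question's deployment):

```shell
java -Xms32G -Xmx48G \
     -verbose:gc \
     -XX:+PrintGCDetails \
     -XX:+PrintGCDateStamps \
     -Xloggc:/path/to/gc.log \
     -XX:+UseGCLogFileRotation \
     -XX:NumberOfGCLogFiles=5 \
     -XX:GCLogFileSize=20M \
     -jar your-app.jar
```

With these flags the JVM writes gc.log.0 through gc.log.4 itself, so no external tool ever needs to touch the live file.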

The problem

For us, this was because logrotate and the JVM were both writing to the file without any coordination. JVM garbage collection logs are special insofar as they are written directly by the JVM itself: the JVM keeps an open handle to the file, and an offset within it where it writes the next log entry.

^@ is actually just a null byte in your file. You can see this if you run hexdump -C your_gc.log. What causes these null bytes is the interesting part - logrotate truncating the file.

$ hexdump -C gc.log | head -3
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
061ca010  00 00 00 00 00 00 00 32  30 32 30 2d 30 37 2d 30  |.......2020-07-0|

This only surfaced for us because we were using Logstash to monitor the GC log. Logstash was crashing with OutOfMemoryError every time logrotate ran, and from inspecting heap dumps, we noticed that logstash was trying to ship a HUGE (600MB in JVM internal memory) log line that looked like:

{ "message": "\u0000\u0000\u0000...

In this case, because logstash escapes each null to its unicode escape (6 characters), and each character is represented internally in the JVM as UTF-16 (2 bytes), the on-heap encoding is a whopping 12 times larger than the null bytes on disk. So it takes a smaller log than you would expect to run out of memory: a 50MB run of nulls on disk becomes 600MB on the heap.
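The blow-up factor can be checked with a little arithmetic (the 50MB disk figure is an illustrative assumption chosen to match the 600MB heap line we observed):

```python
# Each null byte on disk becomes the 6-character escape "\u0000" in
# logstash's JSON, and the JVM stores each String character as 2 bytes
# (UTF-16), so 1 byte on disk -> 12 bytes on the heap.
nulls_on_disk = 50 * 1024 * 1024       # assumed: a 50 MB run of null bytes

chars_per_null = len("\\u0000")        # 6 characters in the JSON escape
bytes_per_char_utf16 = 2               # JVM-internal String encoding

heap_bytes = nulls_on_disk * chars_per_null * bytes_per_char_utf16
print(heap_bytes // (1024 * 1024))     # 600 (MB on the heap)
```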

This led us to the nulls in the garbage collection log, and to where they came from:

1. JVM happily writing logs

*-------------------------*
^                         ^
JVM's file start          JVM's current location

2. logrotate has entered the game

                         **
\________________________/
^                    |    ^
JVM's file start     |    JVM's current location
                     |
                     logrotate copies contents elsewhere and truncates file
                     to zero-length

3. JVM keeps writing

*xxxxxxxxxxxxxxxxxxxxxxxxx-*
\________________________/^^
^                    |    |JVM's current location
JVM's file start     |    JVM writes new log
                     |
                     File is now zero-length, but the JVM still writes
                     at its old offset, so everything before its pointer
                     is filled in with zeros
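The race above is easy to reproduce in a few lines: a writer holds an open handle, a second process truncates the file, and the writer's next write lands at its old offset, with the gap zero-filled by the OS. A minimal sketch (the file name is arbitrary; Python stands in for the JVM and logrotate here):

```python
import os

path = "gc_demo.log"

with open(path, "w") as jvm:            # stands in for the JVM's log handle
    jvm.write("GC pause 1\n" * 100)     # the JVM writes some log lines
    jvm.flush()                          # file offset is now 1100 bytes

    # "logrotate" truncates the file to zero length via a separate handle
    with open(path, "w"):
        pass

    jvm.write("GC pause 2\n")            # JVM keeps writing at its old offset
    jvm.flush()

data = open(path, "rb").read()
print(data[:8])     # b'\x00\x00\x00\x00\x00\x00\x00\x00' - the null padding
print(data[-11:])   # b'GC pause 2\n' - the new log line after the gap
os.remove(path)
```

Running `hexdump -C gc_demo.log` before the cleanup shows exactly the pattern from the question: a run of `00` bytes followed by the fresh log text.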
answered Sep 27 '22 by Will Faithfull