Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

jvm deoptimization takes CPU when joda-time hits leap day

Tags:

jvm

jit

jodatime

After 2/28/2020, we found our Elasticsearch clusters used to process logs(which had date fields) were under-performing.

It's HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.3+12-LTS) that we have been using to run Elasticsearch clusters.

jstack & JVM deoptimization log led us to deoptimization(reason=unstable_if) of joda-time's leap year check:

stacktrace

jstack shows elasticsearch's [bulk] threads runnable, but actually they are blocked (according to gstack) because of JVM deoptimization at BasicGJChronology.java:185 and BasicGJChronology.java:187。Both of the two line are leap year check.

jstack:

"elasticsearch[xxx][bulk][T#19]" #312 daemon prio=5 os_prio=0 cpu=1972224606.34ms elapsed=12348671.96s tid=0x00007f90a0006800 nid=0xdd41 runnable  [0x00007f8f170ee000]
   java.lang.Thread.State: RUNNABLE
    at org.joda.time.chrono.BasicGJChronology.setYear(BasicGJChronology.java:185)
    at org.joda.time.chrono.BasicYearDateTimeField.setExtended(BasicYearDateTimeField.java:92)
    at org.joda.time.format.DateTimeParserBucket$SavedField.set(DateTimeParserBucket.java:568)
    at org.joda.time.format.DateTimeParserBucket.computeMillis(DateTimeParserBucket.java:451)
    at org.joda.time.format.DateTimeParserBucket.doParseMillis(DateTimeParserBucket.java:182)
    at org.joda.time.format.DateTimeFormatter.parseMillis(DateTimeFormatter.java:826)
    at org.elasticsearch.index.mapper.DateFieldMapper$DateFieldType.parse(DateFieldMapper.java:248)
    at org.elasticsearch.index.mapper.DateFieldMapper.parseCreateField(DateFieldMapper.java:456)
    at org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:297)
--
"elasticsearch[xxx][bulk][T#20]" #315 daemon prio=5 os_prio=0 cpu=1975796442.21ms elapsed=12348662.71s tid=0x00007f92f4048000 nid=0xdd83 runnable  [0x00007f8f16cea000]
   java.lang.Thread.State: RUNNABLE
    at org.joda.time.chrono.BasicGJChronology.setYear(BasicGJChronology.java:187)
    at org.joda.time.chrono.BasicYearDateTimeField.setExtended(BasicYearDateTimeField.java:92)
    at org.joda.time.format.DateTimeParserBucket$SavedField.set(DateTimeParserBucket.java:568)
    at org.joda.time.format.DateTimeParserBucket.computeMillis(DateTimeParserBucket.java:451)
    at org.joda.time.format.DateTimeParserBucket.doParseMillis(DateTimeParserBucket.java:182)
    at org.joda.time.format.DateTimeFormatter.parseMillis(DateTimeFormatter.java:826)
    at org.elasticsearch.index.mapper.DateFieldMapper$DateFieldType.parse(DateFieldMapper.java:248)
    at org.elasticsearch.index.mapper.DateFieldMapper.parseCreateField(DateFieldMapper.java:456)
at org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:297)

gstack of above bulk threads:

#0  0x00007f3cf8f566d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f3cf7ef8ef3 in os::PlatformEvent::park() () from /usr/java/jdk-11.0.3/lib/server/libjvm.so
#2  0x00007f3cf7e9feb8 in Monitor::lock_without_safepoint_check() () from /usr/java/jdk-11.0.3/lib/server/libjvm.so
#3  0x00007f3cf79977f3 in Events::log(Thread*, char const*, ...) () from /usr/java/jdk-11.0.3/lib/server/libjvm.so
#4  0x00007f3cf799411a in Deoptimization::uncommon_trap_inner(JavaThread*, int) () from /usr/java/jdk-11.0.3/lib/server/libjvm.so
#5  0x00007f3cf79973d2 in Deoptimization::uncommon_trap(JavaThread*, int, int) () from /usr/java/jdk-11.0.3/lib/server/libjvm.so

code snippets from BasicGJChronology.java:

    long setYear(long instant, int year) {
        int thisYear = getYear(instant);
        int dayOfYear = getDayOfYear(instant, thisYear);
        int millisOfDay = getMillisOfDay(instant);

        if (dayOfYear > (31 + 28)) { // after Feb 28
            if (isLeapYear(thisYear)) { // LINE 185
                // Current date is Feb 29 or later.
                if (!isLeapYear(year)) { // LINE 187
                    // Moving to a non-leap year, Feb 29 does not exist.
                    dayOfYear--;
                }
            } else {
                // Current date is Mar 01 or later.
                if (isLeapYear(year)) {
                    // Moving to a leap year, account for Feb 29.
                    dayOfYear++;
                }
            }
        }
...

Flame graph

From the left peak on the graph you may find deoptimization.

enter image description here

deoptimization log in hs_err_pidxxx.log

Deoptimization events (10 events):
Event: 12413938.027 Thread 0x00007f6174013000 Uncommon trap: reason=unstable_if action=none pc=0x00007f65f543ff88 method=org.joda.time.chrono.BasicGJChronology.setYear(JI)J @ 36 c2
Event: 12413938.027 Thread 0x00007f6174013000 Uncommon trap: reason=unstable_if action=none pc=0x00007f65f543ff88 method=org.joda.time.chrono.BasicGJChronology.setYear(JI)J @ 36 c2

questions

  1. How can we mitigate the deoptimization impact on performance? Seems it's not the cost of re-JIT compiling but the pthread_cond_wait of logging that impacts most?

  2. The clusters had been up for more than a year. Will a rolling restart solve the unstable_if?

  3. Why is it a unstable_if? Sorry I have no knowledge about HotSpot's profile-based performance tactics. But my gut feeling is that the condition should be very stable after leap day (still unstable at 3/5/2020). 12/31/2020 also worries me since I have no idea about the unstable_if.

like image 213
ZHAO Bo Avatar asked Mar 05 '20 08:03

ZHAO Bo


1 Answers

Someone suggested using -XX:PerMethodRecompilationCutoff=10000 -XX:PerBytecodeRecompilationCutoff=10000 in another forum.

And he thought instead of Action_reinterpret, Action_none (perhaps after some reinterpret) is the problem.

He also mentioned this bug might be related but I haven't look into: https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8227523

Sorry I didn't paste the original answer since it's not in English.

like image 144
ZHAO Bo Avatar answered Oct 23 '22 14:10

ZHAO Bo