I was comparing the performance of JDK 8 and 11 using jmh 1.21 when I ran across some surprising numbers:
Java version: 1.8.0_192, vendor: Oracle Corporation Benchmark Mode Cnt Score Error Units MyBenchmark.throwAndConsumeStacktrace avgt 25 21525.584 ± 58.957 ns/op Java version: 9.0.4, vendor: Oracle Corporation Benchmark Mode Cnt Score Error Units MyBenchmark.throwAndConsumeStacktrace avgt 25 28243.899 ± 498.173 ns/op Java version: 10.0.2, vendor: Oracle Corporation Benchmark Mode Cnt Score Error Units MyBenchmark.throwAndConsumeStacktrace avgt 25 28499.736 ± 215.837 ns/op Java version: 11.0.1, vendor: Oracle Corporation Benchmark Mode Cnt Score Error Units MyBenchmark.throwAndConsumeStacktrace avgt 25 48535.766 ± 2175.753 ns/op
OpenJDK 11 and 12 perform similar to OracleJDK 11. I have omitted their numbers for the sake of brevity.
I understand that microbenchmarks do not indicate the performance behavior of real-life applications. Still, I'm curious where this difference is coming from. Any ideas?
Here is the benchmark in its entirety:
pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> <modelVersion>4.0.0</modelVersion> <groupId>jmh</groupId> <artifactId>consume-stacktrace</artifactId> <version>1.0-SNAPSHOT</version> <packaging>jar</packaging> <name>JMH benchmark sample: Java</name> <dependencies> <dependency> <groupId>org.openjdk.jmh</groupId> <artifactId>jmh-core</artifactId> <version>${jmh.version}</version> </dependency> <dependency> <groupId>org.openjdk.jmh</groupId> <artifactId>jmh-generator-annprocess</artifactId> <version>${jmh.version}</version> <scope>provided</scope> </dependency> </dependencies> <properties> <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> <jmh.version>1.21</jmh.version> <javac.target>1.8</javac.target> <uberjar.name>benchmarks</uberjar.name> </properties> <build> <plugins> <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-enforcer-plugin</artifactId> <version>1.4.1</version> <executions> <execution> <id>enforce-versions</id> <goals> <goal>enforce</goal> </goals> <configuration> <rules> <requireMavenVersion> <version>3.0</version> </requireMavenVersion> </rules> </configuration> </execution> </executions> </plugin> <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-compiler-plugin</artifactId> <version>3.8.0</version> <configuration> <compilerVersion>${javac.target}</compilerVersion> <source>${javac.target}</source> <target>${javac.target}</target> </configuration> </plugin> <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-shade-plugin</artifactId> <version>3.2.1</version> <executions> <execution> <phase>package</phase> <goals> <goal>shade</goal> </goals> <configuration> <finalName>${uberjar.name}</finalName> <transformers> <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer"> <mainClass>org.openjdk.jmh.Main</mainClass> </transformer> </transformers> <filters> <filter> <!-- Shading signed JARs will fail without this. http://stackoverflow.com/questions/999489/invalid-signature-file-when-attempting-to-run-a-jar --> <artifact>*:*</artifact> <excludes> <exclude>META-INF/*.SF</exclude> <exclude>META-INF/*.DSA</exclude> <exclude>META-INF/*.RSA</exclude> </excludes> </filter> </filters> </configuration> </execution> </executions> </plugin> </plugins> <pluginManagement> <plugins> <plugin> <artifactId>maven-clean-plugin</artifactId> <version>2.6.1</version> </plugin> <plugin> <artifactId>maven-deploy-plugin</artifactId> <version>2.8.2</version> </plugin> <plugin> <artifactId>maven-install-plugin</artifactId> <version>2.5.2</version> </plugin> <plugin> <artifactId>maven-jar-plugin</artifactId> <version>3.1.0</version> </plugin> <plugin> <artifactId>maven-javadoc-plugin</artifactId> <version>3.0.0</version> </plugin> <plugin> <artifactId>maven-resources-plugin</artifactId> <version>3.1.0</version> </plugin> <plugin> <artifactId>maven-site-plugin</artifactId> <version>3.7.1</version> </plugin> <plugin> <artifactId>maven-source-plugin</artifactId> <version>3.0.1</version> </plugin> <plugin> <artifactId>maven-surefire-plugin</artifactId> <version>2.22.0</version> </plugin> </plugins> </pluginManagement> </build> </project>
src/main/java/jmh/MyBenchmark.java:
package jmh; import org.openjdk.jmh.annotations.Benchmark; import org.openjdk.jmh.annotations.BenchmarkMode; import org.openjdk.jmh.annotations.Mode; import org.openjdk.jmh.annotations.OutputTimeUnit; import org.openjdk.jmh.infra.Blackhole; import java.io.PrintWriter; import java.io.StringWriter; import java.util.concurrent.TimeUnit; @BenchmarkMode(Mode.AverageTime) @OutputTimeUnit(TimeUnit.NANOSECONDS) public class MyBenchmark { @Benchmark public void throwAndConsumeStacktrace(Blackhole bh) { try { throw new IllegalArgumentException("I love benchmarks"); } catch (IllegalArgumentException e) { StringWriter sw = new StringWriter(); e.printStackTrace(new PrintWriter(sw)); bh.consume(sw.toString()); } } }
Here is the Windows-specific script I use. It should be trivial to translate it to other platforms:
set JAVA_HOME=C:\Program Files\Java\jdk1.8.0_192 call mvn -V -Djavac.target=1.8 clean install "%JAVA_HOME%\bin\java" -jar target\benchmarks.jar set JAVA_HOME=C:\Program Files\Java\jdk-9.0.4 call mvn -V -Djavac.target=9 clean install "%JAVA_HOME%\bin\java" -jar target\benchmarks.jar set JAVA_HOME=C:\Program Files\Java\jdk-10.0.2 call mvn -V -Djavac.target=10 clean install "%JAVA_HOME%\bin\java" -jar target\benchmarks.jar set JAVA_HOME=C:\Program Files\Java\oracle-11.0.1 call mvn -V -Djavac.target=11 clean install "%JAVA_HOME%\bin\java" -jar target\benchmarks.jar
My runtime environment is:
Apache Maven 3.6.0 (97c98ec64a1fdfee7767ce5ffb20918da4f719f3; 2018-10-24T14:41:47-04:00) Maven home: C:\Program Files\apache-maven-3.6.0\bin\.. Default locale: en_CA, platform encoding: Cp1252 OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"
More specifically, I am running Microsoft Windows [Version 10.0.17763.195]
.
Almost every data set improves on Java 11 over Java 8 using the G1 garbage collector. On average, there's a 16% improvement just by switching to Java 11.
It is an open-source reference implementation of Java SE platform version 11. Java 11 was released after four years of releasing Java 8. Java 11 comes with new features to provide more functionality. Below are the features which are added in the four and a half years in between these two versions.
There are several reasons why one should upgrade from Java 8 to Java 11. Applications written in Java 9, 10, and 11 are significantly faster and more secure than previous versions of the language. ZGC and Epsilon garbage collectors have improved Garbage Collection.
Why is Java 11 important? Java 11 is the second LTS release after Java 8. Since Java 11, Oracle JDK would no longer be free for commercial use. You can use it in developing stages but to use it commercially, you need to buy a license.
I investigated the issue with async-profiler which can draw cool flame graphs demonstrating where the CPU time is spent.
As @AlekseyShipilev pointed out, the slowdown between JDK 8 and JDK 9 is mainly the result of StackWalker changes. Also G1 has become the default GC since JDK 9. If we explicitly set -XX:+UseParallelGC
(default in JDK 8), the scores will be slightly better.
But the most interesting part is the slowdown in JDK 11.
Here is what async-profiler shows (clickable SVG).
The main difference between two profiles is in the size of java_lang_Throwable::get_stack_trace_elements
block, which is dominated by StringTable::intern
. Apparently StringTable::intern
takes much longer on JDK 11.
Let's zoom in:
Note that StringTable::intern
in JDK 11 calls do_intern
which in turn allocates a new java.lang.String
object. Looks suspicious. Nothing of this kind is seen in JDK 10 profile. Time to look in the source code.
stringTable.cpp (JDK 11)
oop StringTable::intern(Handle string_or_null_h, jchar* name, int len, TRAPS) { // shared table always uses java_lang_String::hash_code unsigned int hash = java_lang_String::hash_code(name, len); oop found_string = StringTable::the_table()->lookup_shared(name, len, hash); if (found_string != NULL) { return found_string; } if (StringTable::_alt_hash) { hash = hash_string(name, len, true); } return StringTable::the_table()->do_intern(string_or_null_h, name, len, | hash, CHECK_NULL); } | ---------------- | v oop StringTable::do_intern(Handle string_or_null_h, const jchar* name, int len, uintx hash, TRAPS) { HandleMark hm(THREAD); // cleanup strings created Handle string_h; if (!string_or_null_h.is_null()) { string_h = string_or_null_h; } else { string_h = java_lang_String::create_from_unicode(name, len, CHECK_NULL); }
The function in JDK 11 first looks for a string in the shared StringTable, does not find it, then goes to do_intern
and immediately creates a new String object.
In JDK 10 sources after a call to lookup_shared
there was an additional lookup in the main table which returned the existing string without creation of a new object:
found_string = the_table()->lookup_in_main_table(index, name, len, hashValue);
This refactoring was a result of JDK-8195097 "Make it possible to process StringTable outside safepoint".
TL;DR While interning method names in JDK 11, HotSpot creates redundant String objects. This has happened after JDK-8195097.
I suspect this is due to several changes.
8->9 regression happened while switching to StackWalker for generating the stack traces (JDK-8150778). Unfortunately, this made VM native code intern a lot of strings, and StringTable becomes the bottleneck. If you profile OP's benchmark, you will see the profile like in JDK-8151751. It should be enough to perf record -g
the entire JVM that runs the benchmark, and then look into perf report
. (Hint, hint, you can do it yourself next time!)
And 10->11 regression must have happened later. I suspect this is due to StringTable preparations for switching to fully concurrent hash table (JDK-8195100, which, as Claes points out, is not entirely in 11) or something else (class data sharing changes?).
Either way, interning on fast path is a bad idea, and patch for JDK-8151751 should have dealt with both regressions.
Watch this:
8u191: 15108 ± 99 ns/op [so far so good]
- 54.55% 0.37% java libjvm.so [.] JVM_GetStackTraceElement - 54.18% JVM_GetStackTraceElement - 52.22% java_lang_Throwable::get_stack_trace_element - 48.23% java_lang_StackTraceElement::create - 17.82% StringTable::intern - 13.92% StringTable::intern - 4.83% Klass::external_name + 3.41% Method::line_number_from_bci
"head": 22382 ± 134 ns/op [regression]
- 69.79% 0.05% org.sample.MyBe libjvm.so [.] JVM_InitStackTraceElement - 69.73% JVM_InitStackTraceElementArray - 69.14% java_lang_Throwable::get_stack_trace_elements - 66.86% java_lang_StackTraceElement::fill_in - 38.48% StringTable::intern - 21.81% StringTable::intern - 2.21% Klass::external_name 1.82% Method::line_number_from_bci 0.97% AccessInternal::PostRuntimeDispatch<G1BarrierSet::AccessBarrier<573
"head" + JDK-8151751 patch: 7511 ± 26 ns/op [woot, even better than 8u]
- 22.53% 0.12% org.sample.MyBe libjvm.so [.] JVM_InitStackTraceElement - 22.40% JVM_InitStackTraceElementArray - 20.25% java_lang_Throwable::get_stack_trace_elements - 12.69% java_lang_StackTraceElement::fill_in + 6.86% Method::line_number_from_bci 2.08% AccessInternal::PostRuntimeDispatch<G1BarrierSet::AccessBarrier 2.24% InstanceKlass::method_with_orig_idnum 1.03% Handle::Handle
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With