At vJUG24, one of the topics was JVM performance.
Slides can be found here.
The speaker showed this example:
static void log(Object... args) {
    for (Object arg : args) {
        System.out.println(arg);
    }
}
It was called like this (I can't quite read the slide, but it was something similar):
void doSomething() {
    log("foo", 4, new Object());
}
He said that, because it is a static method, the JVM could optimise the call by inlining it like this:
void doSomething() {
    System.out.println("foo");
    System.out.println(new Integer(4).toString());
    System.out.println(new Object().toString());
}
Why is it important that the log method is static for the JVM to make this optimisation?
Either the presentation was not quite precise, or you did not get it right.
In fact, the JVM can inline non-static methods, even with varargs. Moreover, it can eliminate the allocation of the corresponding Object[] array in certain cases. Unfortunately, it does not do this when a vararg method iterates over the array using a for loop.
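To make clear where that Object[] comes from, here is a minimal sketch of what a varargs call means at the call site: the compiler builds an explicit array and boxes primitives before the method is entered. The class name VarArgsDesugar and the helper describe are hypothetical and exist only for illustration; they are not part of the talk or the benchmark.

```java
public class VarArgsDesugar {
    // Hypothetical helper: reports what the callee actually receives.
    static String describe(Object... args) {
        StringBuilder sb = new StringBuilder();
        sb.append(args.getClass().getSimpleName())
          .append(" of length ")
          .append(args.length)
          .append(':');
        for (Object arg : args) {
            sb.append(' ').append(arg.getClass().getSimpleName());
        }
        return sb.toString();
    }

    public static void main(String[] unused) {
        // The call as written in source code.
        String viaVarargs = describe("foo", 4, new Object());
        // Roughly what the compiler emits for it: an explicit array plus boxing of 4.
        String viaArray = describe(new Object[] {"foo", Integer.valueOf(4), new Object()});

        System.out.println(viaVarargs);
        System.out.println(viaVarargs.equals(viaArray));
    }
}
```

Both calls reach the method with the same kind of argument, so whether the array is ever really allocated is up to the JIT after inlining, not to the source syntax.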
I made the following JMH benchmark to verify the theory and ran it with the GC profiler (-prof gc).
package bench;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.infra.Blackhole;

public class VarArgs {

    @Benchmark
    public void inlineNonStatic(Blackhole bh) {
        inlineNonStaticVA(bh, "foo", 4, new Object());
    }

    @Benchmark
    public void inlineStatic(Blackhole bh) {
        inlineStaticVA(bh, "foo", 4, new Object());
    }

    @Benchmark
    public void loopNonStatic(Blackhole bh) {
        loopNonStaticVA(bh, "foo", 4, new Object());
    }

    @Benchmark
    public void loopStatic(Blackhole bh) {
        loopStaticVA(bh, "foo", 4, new Object());
    }

    public void inlineNonStaticVA(Blackhole bh, Object... args) {
        if (args.length > 0) bh.consume(args[0]);
        if (args.length > 1) bh.consume(args[1]);
        if (args.length > 2) bh.consume(args[2]);
        if (args.length > 3) bh.consume(args[3]);
    }

    public static void inlineStaticVA(Blackhole bh, Object... args) {
        if (args.length > 0) bh.consume(args[0]);
        if (args.length > 1) bh.consume(args[1]);
        if (args.length > 2) bh.consume(args[2]);
        if (args.length > 3) bh.consume(args[3]);
    }

    public void loopNonStaticVA(Blackhole bh, Object... args) {
        for (Object arg : args) {
            bh.consume(arg);
        }
    }

    public static void loopStaticVA(Blackhole bh, Object... args) {
        for (Object arg : args) {
            bh.consume(arg);
        }
    }
}
Running with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining shows that all 4 variants are successfully inlined into the caller:
@ 28 bench.VarArgs::inlineNonStaticVA (52 bytes) inline (hot)
@ 27 bench.VarArgs::inlineStaticVA (52 bytes) inline (hot)
@ 28 bench.VarArgs::loopNonStaticVA (35 bytes) inline (hot)
@ 27 bench.VarArgs::loopStaticVA (33 bytes) inline (hot)
The results confirm that there is no performance difference between calling static vs. non-static methods.
Benchmark                Mode  Cnt   Score   Error  Units
VarArgs.inlineNonStatic  avgt   20   9,606 ± 0,076  ns/op
VarArgs.inlineStatic     avgt   20   9,604 ± 0,040  ns/op
VarArgs.loopNonStatic    avgt   20  14,188 ± 0,154  ns/op
VarArgs.loopStatic       avgt   20  14,147 ± 0,059  ns/op
However, the GC profiler indicates that the vararg Object[] array is allocated for the loop* methods (48 B/op), but not for the inline* methods (16 B/op).
Benchmark                                    Mode  Cnt   Score   Error  Units
VarArgs.inlineNonStatic:·gc.alloc.rate.norm  avgt   20  16,000 ± 0,001   B/op
VarArgs.inlineStatic:·gc.alloc.rate.norm     avgt   20  16,000 ± 0,001   B/op
VarArgs.loopNonStatic:·gc.alloc.rate.norm    avgt   20  48,000 ± 0,001   B/op
VarArgs.loopStatic:·gc.alloc.rate.norm       avgt   20  48,000 ± 0,001   B/op
I guess the original point was that static methods are always monomorphic. However, the JVM can also inline polymorphic methods if there are not too many actual receivers at the particular call site.
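As a rough illustration of "monomorphic" versus "polymorphic" here, the sketch below contrasts a static call, whose target is fixed at compile time, with a virtual call site that sees more than one receiver class. All names (CallSiteShapes, Sink, CountingSink, NonNullSink, callThrough) are hypothetical. The code only shows the shape of the call sites; it does not prove what the JIT does with them, which would need something like the PrintInlining output above.

```java
public class CallSiteShapes {
    // Hypothetical interface with a vararg method, so the call site is virtual.
    interface Sink {
        int accept(Object... args);
    }

    static final class CountingSink implements Sink {
        @Override
        public int accept(Object... args) {
            return args.length;
        }
    }

    static final class NonNullSink implements Sink {
        @Override
        public int accept(Object... args) {
            int count = 0;
            for (Object arg : args) {
                if (arg != null) count++;
            }
            return count;
        }
    }

    // Static target: there is exactly one possible callee, so the call is always monomorphic.
    static int staticTarget(Object... args) {
        return args.length;
    }

    // One virtual call site. How many receiver classes pass through it at run time
    // decides whether the JIT sees it as monomorphic or polymorphic.
    static int callThrough(Sink sink) {
        return sink.accept("foo", 4, "bar");
    }

    public static void main(String[] unused) {
        System.out.println(staticTarget("foo", 4, "bar"));
        // Two different receiver classes reach the same call site inside callThrough.
        System.out.println(callThrough(new CountingSink()));
        System.out.println(callThrough(new NonNullSink()));
    }
}
```

With only a couple of receiver classes, as here, the virtual call site is the kind the answer describes as still inlinable; a call site that sees many different receivers is where a static method would have an advantage.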