Can someone explain why JMH says that returning 1 is faster than returning 0?
Here is the benchmark code.
    import org.openjdk.jmh.annotations.*;
    import java.util.concurrent.TimeUnit;

    @State(Scope.Thread)
    @BenchmarkMode(Mode.Throughput)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    @Fork(value = 3, jvmArgsAppend = {"-server", "-disablesystemassertions"})
    public class ZeroVsOneBenchmark {

        @Benchmark
        @Warmup(iterations = 3, time = 2, timeUnit = TimeUnit.SECONDS)
        public int zero() {
            return 0;
        }

        @Benchmark
        @Warmup(iterations = 3, time = 2, timeUnit = TimeUnit.SECONDS)
        public int one() {
            return 1;
        }
    }
Here is the result:
    # Run complete. Total time: 00:03:05
    Benchmark                     Mode   Samples        Score  Score error   Units
    c.m.ZeroVsOneBenchmark.one    thrpt       60  1680674.502    24113.014  ops/ms
    c.m.ZeroVsOneBenchmark.zero   thrpt       60   735975.568    14779.380  ops/ms
The behaviour is the same when benchmarking one, two and zero:
    # Run complete. Total time: 01:01:56
    Benchmark                     Mode   Samples        Score  Score error   Units
    c.m.ZeroVsOneBenchmark.one    thrpt       90  1762956.470     7554.807  ops/ms
    c.m.ZeroVsOneBenchmark.two    thrpt       90  1764642.299     9277.673  ops/ms
    c.m.ZeroVsOneBenchmark.zero   thrpt       90   773010.467     5031.920  ops/ms
JMH is a good tool but still not perfect.
Certainly, there is no speed difference between returning 0, 1 or any other integer. However, it makes a difference how the value is consumed by JMH and how that consuming code is compiled by the HotSpot JIT.
To prevent the JIT from optimizing out calculations, JMH uses the special Blackhole class to consume values returned from a benchmark. Here is the one for integer values:
    public final void consume(int i) {
        if (i == i1 & i == i2) {
            // SHOULD NEVER HAPPEN
            nullBait.i1 = i; // implicit null pointer exception
        }
    }
Here i is a value returned from a benchmark. In your case it is either 0 or 1. When i == 1, the never-happen condition looks like if (1 == i1 & 1 == i2), which is compiled as follows:
    0x0000000002b4ffe5: mov    0xb0(%r13),%r10d    ;*getfield i1
    0x0000000002b4ffec: mov    0xb4(%r13),%r8d     ;*getfield i2
    0x0000000002b4fff3: cmp    $0x1,%r8d
    0x0000000002b4fff7: je     0x0000000002b50091  ;*return
But when i == 0, the JIT tries to "optimize" the two comparisons to 0 using setne instructions. However, the resulting code becomes too complicated:
    0x0000000002a40b28: mov    0xb0(%rdi),%r10d    ;*getfield i1
    0x0000000002a40b2f: mov    0xb4(%rdi),%r8d     ;*getfield i2
    0x0000000002a40b36: test   %r10d,%r10d
    0x0000000002a40b39: setne  %r10b
    0x0000000002a40b3d: movzbl %r10b,%r10d
    0x0000000002a40b41: test   %r8d,%r8d
    0x0000000002a40b44: setne  %r11b
    0x0000000002a40b48: movzbl %r11b,%r11d
    0x0000000002a40b4c: xor    $0x1,%r10d
    0x0000000002a40b50: xor    $0x1,%r11d
    0x0000000002a40b54: and    %r11d,%r10d
    0x0000000002a40b57: test   %r10d,%r10d
    0x0000000002a40b5a: jne    0x0000000002a40c15  ;*return
That is, the slower return 0 is explained by more CPU instructions being executed in Blackhole.consume(), not by the benchmark method itself.
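To make the guard logic easier to follow, here is a minimal standalone sketch of the same pattern. It is not the real JMH class: MiniBlackhole, Bait and survives are hypothetical names, and the field values 1 and 2 are an assumption chosen only so that no int can equal both. It shows that the condition is false for every input, so the only observable difference between inputs is how the JIT compiles the comparison.

```java
public class MiniBlackhole {
    /** Holder type for the null "bait" reference (hypothetical name). */
    static final class Bait {
        int i1;
    }

    // Assumption: the two guard fields hold different values,
    // so no single int can be equal to both of them.
    // They are volatile, so each consume() call performs real loads.
    volatile int i1 = 1;
    volatile int i2 = 2;

    // Deliberately null: a write through it would throw NullPointerException.
    Bait nullBait = null;

    public final void consume(int i) {
        // Non-short-circuit '&' evaluates both comparisons, as in the quoted code.
        if (i == i1 & i == i2) {
            // SHOULD NEVER HAPPEN
            nullBait.i1 = i; // implicit null pointer exception
        }
    }

    /** Returns true if consume(value) completes without throwing. */
    static boolean survives(int value) {
        try {
            new MiniBlackhole().consume(value);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Every value should pass through the guard untouched.
        for (int value : new int[] {0, 1, 2, -1}) {
            System.out.println(value + " -> " + survives(value));
        }
    }
}
```

Running this only confirms the guard never fires; it says nothing about timing. To see the machine code difference you would still need to inspect the JIT output for the real Blackhole.consume().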
Note to JMH developers: I would suggest rewriting Blackhole.consume like
    if (i == l1) {
        // SHOULD NEVER HAPPEN
        nullBait.i1 = i; // implicit null pointer exception
    }
where volatile long l1 = Long.MIN_VALUE. In this case the condition will still be always false, because an int widened to long can never equal Long.MIN_VALUE, but it will be compiled equally for all return values.
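As a quick sanity check of the "always false" claim, the following sketch evaluates the suggested condition for a few int values, including both extremes of the int range. ProposedGuard and guardFires are hypothetical names used only for illustration. The sketch does not show how HotSpot compiles the comparison; it only checks the arithmetic.

```java
public class ProposedGuard {
    // Long.MIN_VALUE lies outside the int range, even after widening.
    volatile long l1 = Long.MIN_VALUE;

    /** Hypothetical helper: evaluates the suggested guard condition for one value. */
    boolean guardFires(int i) {
        // The int operand is widened to long before the comparison.
        return i == l1;
    }

    public static void main(String[] args) {
        ProposedGuard guard = new ProposedGuard();
        int[] samples = {Integer.MIN_VALUE, -1, 0, 1, 2, Integer.MAX_VALUE};
        for (int value : samples) {
            System.out.println(value + " -> " + guard.guardFires(value));
        }
    }
}
```

Since the smallest int widens to -2147483648L, which is far above Long.MIN_VALUE, the guard cannot fire for any benchmark return value.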