Lambda performance improvement, Java 8 vs 11

I ran some JMH benchmarks comparing a lambda with a method reference, similar to:

IntStream......reduce(Integer::max)
vs.
IntStream.......reduce((i1, i2) -> Integer.max(i1, i2))

What I noticed was that, in Java 8, the method reference ran about 5 times faster than the lambda. When I ran the same test in Java 11, both approaches took about as long as the method reference did in Java 8. So there is no major difference in performance between lambda and method reference in Java 11.

My question is: What improvement(s) have been made from Java 8 to 11 to boost this performance? I'm using OpenJDK.

EDIT: My benchmark:

import java.util.List;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
@Fork(value = 1, jvmArgs = {"-XX:CompileThreshold=5000"})
@Warmup(iterations = 2)
public class FindMaxInt {

    // Number of elements in the input list
    @Param({"10000", "1000000", "10000000"})
    private int n;

    private List<Integer> data;

    @Setup
    public void setup(){
        // createData() is not shown in the question
        data = createData();
    }

    @Benchmark
    public void streamWithMethodReference(final Blackhole blackhole){
        int max = data.stream().mapToInt(Integer::intValue).reduce(Integer.MIN_VALUE, Integer::max);
        blackhole.consume(max);
    }

    @Benchmark
    public void streamWithLambda(final Blackhole blackhole){
        int max = data.stream().mapToInt(Integer::intValue).reduce(Integer.MIN_VALUE, (i1, i2) -> Integer.max(i1, i2));
        blackhole.consume(max);
    }
}
Johan Wiström asked Mar 02 '19 11:03


1 Answer

This is a combination of effects described in two other answers.

The different results are explained by different inlining trees. A lambda has one more level of indirection than a method reference, so during JIT compilation the expression with the lambda may reach the inlining depth limit earlier. The default is -XX:MaxInlineLevel=9.
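The extra level of indirection can be observed with plain reflection. The following is a minimal sketch, assuming a compiler such as javac that moves each lambda body into a synthetic `lambda$...` method of the enclosing class; the class and method names here are hypothetical and not part of the benchmark.

```java
import java.lang.reflect.Method;
import java.util.function.IntBinaryOperator;

class LambdaIndirection {
    // Method reference: the generated function object calls Integer.max directly.
    static final IntBinaryOperator METHOD_REF = Integer::max;

    // Lambda: the compiler moves the body into a synthetic method of this class,
    // so the generated function object calls that method, which then calls Integer.max.
    static final IntBinaryOperator LAMBDA = (i1, i2) -> Integer.max(i1, i2);

    /** Counts the compiler-generated lambda body methods declared in this class. */
    static int syntheticLambdaMethods() {
        int count = 0;
        for (Method m : LambdaIndirection.class.getDeclaredMethods()) {
            if (m.isSynthetic() && m.getName().startsWith("lambda$")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("method reference: " + METHOD_REF.applyAsInt(3, 7));
        System.out.println("lambda:           " + LAMBDA.applyAsInt(3, 7));
        System.out.println("synthetic lambda methods: " + syntheticLambdaMethods());
    }
}
```

Only the lambda contributes a synthetic method. That method corresponds to the `lambda$streamWithLambda$0` frame in the JDK 8 inlining output, which sits one level above `Integer::max`.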

Run the benchmark with -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining to see the whole inlining tree. Here is what we get on JDK 8:

1563  560       4       bench.FindMaxInt::streamWithLambda (38 bytes)
                           @ 3   java.util.stream.IntPipeline::<init> (7 bytes)   inline (hot)
                             @ 3   java.util.stream.AbstractPipeline::<init> (91 bytes)   inline (hot)
                               @ 1   java.util.stream.PipelineHelper::<init> (5 bytes)   inline (hot)
                                 @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                               @ 51   java.util.stream.StreamOpFlag::combineOpFlags (9 bytes)   inline (hot)
                                 @ 2   java.util.stream.StreamOpFlag::getMask (30 bytes)   inline (hot)
                               @ 66   java.util.stream.IntPipeline$StatelessOp::opIsStateful (2 bytes)   inline (hot)
                           @ 4   java.util.Collection::stream (11 bytes)   inline (hot)
                            \-> TypeProfile (5120/5120 counts) = java/util/ArrayList
                             @ 1   java.util.ArrayList::spliterator (12 bytes)   inline (hot)
                               @ 8   java.util.ArrayList$ArrayListSpliterator::<init> (26 bytes)   inline (hot)
                                 @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                             @ 7   java.util.stream.StreamSupport::stream (19 bytes)   inline (hot)
                               @ 1   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                               @ 11   java.util.stream.StreamOpFlag::fromCharacteristics (37 bytes)   inline (hot)
                                 @ 1   java.util.ArrayList$ArrayListSpliterator::characteristics (4 bytes)   inline (hot)
                                  \-> TypeProfile (5124/5124 counts) = java/util/ArrayList$ArrayListSpliterator
                               @ 15   java.util.stream.ReferencePipeline$Head::<init> (8 bytes)   inline (hot)
                                 @ 4   java.util.stream.ReferencePipeline::<init> (8 bytes)   inline (hot)
                                   @ 4   java.util.stream.AbstractPipeline::<init> (55 bytes)   inline (hot)
                                     @ 1   java.util.stream.PipelineHelper::<init> (5 bytes)   inline (hot)
                                       @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                           @ 9   java.lang.invoke.LambdaForm$MH/883049899::linkToTargetMethod (8 bytes)   force inline by annotation
                             @ 4   java.lang.invoke.LambdaForm$MH/1922154895::identity_L (8 bytes)   force inline by annotation
                           @ 14   java.util.stream.ReferencePipeline::mapToInt (26 bytes)   inline (hot)
                            \-> TypeProfile (5120/5120 counts) = java/util/stream/ReferencePipeline$Head
                             @ 1   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                             @ 22   java.util.stream.ReferencePipeline$4::<init> (20 bytes)   inline (hot)
                               @ 16   java.util.stream.IntPipeline$StatelessOp::<init> (29 bytes)   inline (hot)
                                 @ 3   java.util.stream.IntPipeline::<init> (7 bytes)   inline (hot)
                                   @ 3   java.util.stream.AbstractPipeline::<init> (91 bytes)   inline (hot)
                                     @ 1   java.util.stream.PipelineHelper::<init> (5 bytes)   inline (hot)
                                       @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                                     @ 51   java.util.stream.StreamOpFlag::combineOpFlags (9 bytes)   inline (hot)
                                       @ 2   java.util.stream.StreamOpFlag::getMask (30 bytes)   inline (hot)
                                     @ 66   java.util.stream.IntPipeline$StatelessOp::opIsStateful (2 bytes)   inline (hot)
                           @ 21   java.lang.invoke.LambdaForm$MH/883049899::linkToTargetMethod (8 bytes)   force inline by annotation
                             @ 4   java.lang.invoke.LambdaForm$MH/1922154895::identity_L (8 bytes)   force inline by annotation
                           @ 26   java.util.stream.IntPipeline::reduce (16 bytes)   inline (hot)
                            \-> TypeProfile (5120/5120 counts) = java/util/stream/ReferencePipeline$4
                             @ 3   java.util.stream.ReduceOps::makeInt (18 bytes)   inline (hot)
                               @ 1   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                               @ 14   java.util.stream.ReduceOps$5::<init> (16 bytes)   inline (hot)
                                 @ 12   java.util.stream.ReduceOps$ReduceOp::<init> (10 bytes)   inline (hot)
                                   @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                             @ 6   java.util.stream.AbstractPipeline::evaluate (94 bytes)   inline (hot)
                               @ 50   java.util.stream.AbstractPipeline::isParallel (8 bytes)   inline (hot)
                               @ 80   java.util.stream.TerminalOp::getOpFlags (2 bytes)   inline (hot)
                                \-> TypeProfile (5130/5130 counts) = java/util/stream/ReduceOps$5
                               @ 85   java.util.stream.AbstractPipeline::sourceSpliterator (265 bytes)   inline (hot)
                                 @ 79   java.util.stream.AbstractPipeline::isParallel (8 bytes)   inline (hot)
                               @ 88   java.util.stream.ReduceOps$ReduceOp::evaluateSequential (18 bytes)   inline (hot)
                                 @ 2   java.util.stream.ReduceOps$5::makeSink (5 bytes)   inline (hot)
                                   @ 1   java.util.stream.ReduceOps$5::makeSink (16 bytes)   inline (hot)
                                     @ 12   java.util.stream.ReduceOps$5ReducingSink::<init> (15 bytes)   inline (hot)
                                       @ 11   java.lang.Object::<init> (1 bytes)   inline (hot)
                                 @ 6   java.util.stream.AbstractPipeline::wrapAndCopyInto (18 bytes)   inline (hot)
                                   @ 3   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                                   @ 9   java.util.stream.AbstractPipeline::wrapSink (37 bytes)   inline (hot)
                                     @ 1   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                                     @ 23   java.util.stream.ReferencePipeline$4::opWrapSink (10 bytes)   inline (hot)
                                      \-> TypeProfile (5081/5081 counts) = java/util/stream/ReferencePipeline$4
                                       @ 6   java.util.stream.ReferencePipeline$4$1::<init> (11 bytes)   inline (hot)
                                         @ 7   java.util.stream.Sink$ChainedReference::<init> (16 bytes)   inline (hot)
                                           @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                                           @ 6   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                                   @ 13   java.util.stream.AbstractPipeline::copyInto (53 bytes)   inline (hot)
                                     @ 1   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                                     @ 9   java.util.stream.AbstractPipeline::getStreamAndOpFlags (5 bytes)   accessor
                                     @ 12   java.util.stream.StreamOpFlag::isKnown (19 bytes)   inline (hot)
                                     @ 20   java.util.Spliterator::getExactSizeIfKnown (25 bytes)   inline (hot)
                                      \-> TypeProfile (5081/5081 counts) = java/util/ArrayList$ArrayListSpliterator
                                       @ 1   java.util.ArrayList$ArrayListSpliterator::characteristics (4 bytes)   inline (hot)
                                       @ 19   java.util.ArrayList$ArrayListSpliterator::estimateSize (11 bytes)   inline (hot)
                                         @ 1   java.util.ArrayList$ArrayListSpliterator::getFence (48 bytes)   inline (hot)
                                           @ 38   java.util.ArrayList::access$000 (5 bytes)   accessor
                                     @ 25   java.util.stream.Sink$ChainedReference::begin (11 bytes)   inline (hot)
                                      \-> TypeProfile (5081/5081 counts) = java/util/stream/ReferencePipeline$4$1
                                       @ 5   java.util.stream.ReduceOps$5ReducingSink::begin (9 bytes)   inline (hot)
                                        \-> TypeProfile (5079/5079 counts) = java/util/stream/ReduceOps$5ReducingSink
                                     @ 32   java.util.ArrayList$ArrayListSpliterator::forEachRemaining (129 bytes)   inline (hot)
                                       @ 51   java.util.ArrayList::access$000 (5 bytes)   accessor
                                       @ 99   java.util.stream.ReferencePipeline$4$1::accept (23 bytes)   inline (hot)
                                         @ 12   bench.FindMaxInt$$Lambda$8/390011259::applyAsInt (8 bytes)   inline (hot)
                                          \-> TypeProfile (13752/13752 counts) = bench/FindMaxInt$$Lambda$8
                                           @ 4   java.lang.Integer::intValue (5 bytes)   accessor
                                         @ 17   java.util.stream.ReduceOps$5ReducingSink::accept (19 bytes)   inline (hot)
                                          \-> TypeProfile (13752/13752 counts) = java/util/stream/ReduceOps$5ReducingSink
                                           @ 10   bench.FindMaxInt$$Lambda$9/208515840::applyAsInt (6 bytes)   inline (hot)
                                            \-> TypeProfile (9107/9107 counts) = bench/FindMaxInt$$Lambda$9
                                             @ 2   bench.FindMaxInt::lambda$streamWithLambda$0 (6 bytes)   inline (hot)
                                               @ 2   java.lang.Integer::max (6 bytes)   inlining too deep
                                     @ 38   java.util.stream.Sink$ChainedReference::end (10 bytes)   inline (hot)
                                       @ 4   java.util.stream.Sink::end (1 bytes)   inline (hot)
                                        \-> TypeProfile (5125/5125 counts) = java/util/stream/ReduceOps$5ReducingSink
                                 @ 12   java.util.stream.ReduceOps$5ReducingSink::get (5 bytes)   inline (hot)
                                   @ 1   java.util.stream.ReduceOps$5ReducingSink::get (8 bytes)   inline (hot)
                                     @ 4   java.lang.Integer::valueOf (32 bytes)   inline (hot)
                                       @ 28   java.lang.Integer::<init> (10 bytes)   inline (hot)
                                         @ 1   java.lang.Number::<init> (5 bytes)   inline (hot)
                                           @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                             @ 12   java.lang.Integer::intValue (5 bytes)   accessor
                           @ 34   org.openjdk.jmh.infra.Blackhole::consume (28 bytes)   disallowed by CompilerOracle

The key lines are the following. They mean the inlining breaks exactly at the final call to Integer.max, because the default limit of 9 levels is reached.

@ 2   bench.FindMaxInt::lambda$streamWithLambda$0 (6 bytes)   inline (hot)
  @ 2   java.lang.Integer::max (6 bytes)   inlining too deep
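To experiment with the depth limit outside JMH, a rough stand-alone driver can be enough. This is only a sketch: `InlineTreeDemo`, `randomData`, the list size and the iteration count are all assumptions, and the resulting inlining tree will not match the JMH output line for line because the calling frames differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class InlineTreeDemo {
    /** Same pipeline as streamWithLambda in the question. */
    static int maxWithLambda(List<Integer> data) {
        return data.stream().mapToInt(Integer::intValue)
                   .reduce(Integer.MIN_VALUE, (i1, i2) -> Integer.max(i1, i2));
    }

    /** Same pipeline as streamWithMethodReference in the question. */
    static int maxWithMethodReference(List<Integer> data) {
        return data.stream().mapToInt(Integer::intValue)
                   .reduce(Integer.MIN_VALUE, Integer::max);
    }

    /** Hypothetical stand-in for the createData() method the question omits. */
    static List<Integer> randomData(int n, long seed) {
        Random random = new Random(seed);
        List<Integer> data = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            data.add(random.nextInt());
        }
        return data;
    }

    public static void main(String[] args) {
        // Run only one variant per JVM so the two do not share type profiles.
        boolean useLambda = args.length == 0 || "lambda".equals(args[0]);
        List<Integer> data = randomData(10_000, 42L);
        long sink = 0;
        // Repeat often enough for the JIT compiler to compile the hot methods.
        for (int i = 0; i < 20_000; i++) {
            sink += useLambda ? maxWithLambda(data) : maxWithMethodReference(data);
        }
        System.out.println("checksum: " + sink);
    }
}
```

Running it as `java -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining InlineTreeDemo lambda`, and then again with a higher limit such as `-XX:MaxInlineLevel=15` added, is one way to check whether the "inlining too deep" line for Integer::max goes away.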

The shape of the inlining tree is very different on JDK 11:

1588  705       4       bench.FindMaxInt::streamWithLambda (38 bytes)
                           @ 4   java.util.Collection::stream (11 bytes)   inline (hot)
                            \-> TypeProfile (5263/5263 counts) = java/util/ArrayList
                             @ 1   java.util.ArrayList::spliterator (12 bytes)   inline (hot)
                               @ 8   java.util.ArrayList$ArrayListSpliterator::<init> (26 bytes)   inline (hot)
                                 @ 6   java.lang.Object::<init> (1 bytes)   inline (hot)
                             @ 7   java.util.stream.StreamSupport::stream (19 bytes)   inline (hot)
                               @ 1   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                               @ 11   java.util.stream.StreamOpFlag::fromCharacteristics (37 bytes)   inline (hot)
                                 @ 1   java.util.ArrayList$ArrayListSpliterator::characteristics (4 bytes)   inline (hot)
                                  \-> TypeProfile (5125/5125 counts) = java/util/ArrayList$ArrayListSpliterator
                               @ 15   java.util.stream.ReferencePipeline$Head::<init> (8 bytes)   inline (hot)
                                 @ 4   java.util.stream.ReferencePipeline::<init> (8 bytes)   inline (hot)
                                   @ 4   java.util.stream.AbstractPipeline::<init> (55 bytes)   inline (hot)
                                     @ 1   java.util.stream.PipelineHelper::<init> (5 bytes)   inline (hot)
                                       @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                           @ 9   java.lang.invoke.Invokers$Holder::linkToTargetMethod (8 bytes)   force inline by annotation
                             @ 4   java.lang.invoke.LambdaForm$MH/0x0000000800060440::invoke (8 bytes)   force inline by annotation
                           @ 14   java.util.stream.ReferencePipeline::mapToInt (26 bytes)   inline (hot)
                            \-> TypeProfile (5263/5263 counts) = java/util/stream/ReferencePipeline$Head
                             @ 1   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                             @ 22   java.util.stream.ReferencePipeline$4::<init> (20 bytes)   inline (hot)
                               @ 16   java.util.stream.IntPipeline$StatelessOp::<init> (29 bytes)   inline (hot)
                                 @ 3   java.util.stream.IntPipeline::<init> (7 bytes)   inline (hot)
                                   @ 3   java.util.stream.AbstractPipeline::<init> (91 bytes)   inline (hot)
                                     @ 1   java.util.stream.PipelineHelper::<init> (5 bytes)   inline (hot)
                                       @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                                     @ 51   java.util.stream.StreamOpFlag::combineOpFlags (9 bytes)   inline (hot)
                                       @ 2   java.util.stream.StreamOpFlag::getMask (30 bytes)   inline (hot)
                                     @ 66   java.util.stream.IntPipeline$StatelessOp::opIsStateful (2 bytes)   inline (hot)
                           @ 21   java.lang.invoke.Invokers$Holder::linkToTargetMethod (8 bytes)   force inline by annotation
                             @ 4   java.lang.invoke.LambdaForm$MH/0x0000000800060440::invoke (8 bytes)   force inline by annotation
                           @ 26   java.util.stream.IntPipeline::reduce (16 bytes)   inline (hot)
                            \-> TypeProfile (5263/5263 counts) = java/util/stream/ReferencePipeline$4
                             @ 3   java.util.stream.ReduceOps::makeInt (18 bytes)   inline (hot)
                               @ 1   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                               @ 14   java.util.stream.ReduceOps$6::<init> (16 bytes)   inline (hot)
                                 @ 12   java.util.stream.ReduceOps$ReduceOp::<init> (10 bytes)   inline (hot)
                                   @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                             @ 6   java.util.stream.AbstractPipeline::evaluate (94 bytes)   inline (hot)
                               @ 50   java.util.stream.AbstractPipeline::isParallel (8 bytes)   inline (hot)
                               @ 80   java.util.stream.TerminalOp::getOpFlags (2 bytes)   inline (hot)
                                \-> TypeProfile (5362/5362 counts) = java/util/stream/ReduceOps$6
                               @ 85   java.util.stream.AbstractPipeline::sourceSpliterator (265 bytes)   inline (hot)
                                 @ 79   java.util.stream.AbstractPipeline::isParallel (8 bytes)   inline (hot)
                               @ 88   java.util.stream.ReduceOps$ReduceOp::evaluateSequential (18 bytes)   already compiled into a big method
                             @ 12   java.lang.Integer::intValue (5 bytes)   accessor
                           @ 34   org.openjdk.jmh.infra.Blackhole::consume (28 bytes)   disallowed by CompileCommand

The inlining tree is cut off much earlier, and for a different reason:

@ 88   java.util.stream.ReduceOps$ReduceOp::evaluateSequential (18 bytes)   already compiled into a big method

The default garbage collector is G1 in JDK 11 (it was Parallel GC in JDK 8). The compiled code appears larger because of G1 barriers, which is why the inlining heuristics prevented the hottest forEachRemaining loop from being inlined into the streamWithLambda method.

In fact, this is not an optimization in JDK 11 but rather the opposite. However, the overall performance of this particular benchmark turned out better, because the inlining tree cutoff happened outside the hottest loop.
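Before comparing two JDKs, it may help to confirm which values the relevant flags actually have in each JVM. A small sketch, assuming a HotSpot-based JVM where `com.sun.management.HotSpotDiagnosticMXBean` is available (the class name `InliningFlags` is hypothetical):

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class InliningFlags {
    /** Returns the current value of a HotSpot -XX option as a string. */
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // Throws IllegalArgumentException if this JVM has no option with that name.
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        // Flags mentioned in the answer: the inlining depth limit and the GC choice.
        for (String name : new String[] {"MaxInlineLevel", "UseG1GC", "UseParallelGC"}) {
            System.out.println(name + " = " + vmOption(name));
        }
    }
}
```

Running the benchmark on JDK 11 with -XX:+UseParallelGC, or on JDK 8 with -XX:+UseG1GC, and printing the inlining tree again would be one way to test whether the GC barriers are what changes the cutoff point.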

[Image: inlining tree]

apangin answered Oct 17 '22 10:10