Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance of Collections.emptyList and empty ArrayList with JIT compiler

Is there a performance difference between using Collections.emptyList() or an empty ArrayList, especially when using a JIT compiler?

I could imagine that - for example - the JIT compiler doesn't do inlining or static method calls because the executed method depends on the type.

Edit I know that Collections.emptyList() returns an immutable list while ArrayList is mutable object.

What I mean is that if I pass one or the other to a method as a parameter and the method doesn't modify the list does that restrict the possibilities for the JIT compiler to optimize the method?

A simple example (just to clarify what I mean):

int sum(List<Integer> list)
{
    int sum = 0;

    for(int i=0;i<list.size();++i)
      sum += list.get(i);

    return sum;
}

If I only called this method with ArrayList the JIT compiler could inline ArrayList.get(). If I also made calls with Collections.empty() it wouldn't be possible.

Is that correct?

like image 871
Jimmy R.T. Avatar asked Oct 24 '15 11:10

Jimmy R.T.


2 Answers

Disclamer

All that is written below applies only to HotSpot JVM.

Short Answers

the JIT compiler doesn't do inlining or static method calls because the executed method depends on the type.

This is opposite of true. See my answer.

Is there a performance difference between using Collections.emptyList() or an empty ArrayList, especially when using a JIT compiler?

In rare cases - yes. See microbenchmark results.

If I only called this method with ArrayList the JIT compiler could inline ArrayList.get(). If I also made calls with Collections.empty() it wouldn't be possible. Is that correct?

The short answer - it depends. JIT compiler is smart enough to recognize monomorphic, bimorphic and polimorphic call patterns and provide appropriate implementations.

Answer

In order to get a detailed answer I would recommend to read the following post about black magic of method dispatching. In few words

C2 does an interesting profile-guided optimization based on the observed type profile. If there is only a single receiver type (that is, the call site is monomorphic), it can simply check for the predicted type, and inline the target directly. The same optimization can and will be applied if there are two receiver types observed (that is, the call site is bimorphic), at the cost of two branches.

Let's consider the following JMH example(if you haven’t learned about JMH then I suggest to read about it here).

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(value = 5)
public class ExampleBench {

    @Param("10000")
    private int count;

    List<Integer>[] arrays;
    List<Integer>[] empty;
    List<Integer>[] bimorphic;
    List<Integer>[] polimorphic;

    @Setup
    public void setup(){
        Random r = new Random(0xBAD_BEEF);
        arrays = new List[count];
        empty = new List[count];
        bimorphic = new List[count];
        polimorphic = new List[count];
        for (int i = 0; i < arrays.length; i++) {
            bimorphic[i] = r.nextBoolean() ? new ArrayList<Integer>(0) : Collections.<Integer>emptyList();
            int i1 = r.nextInt(3);
            switch (i1) {
                case 0 : polimorphic[i] = new ArrayList<>(0);
                    break;
                case 1 : polimorphic[i] = new LinkedList<>();
                    break;
                case 2 : polimorphic[i] = Collections.emptyList();
                    break;
            }
            arrays[i] = new ArrayList<>(0);
            empty[i] = Collections.emptyList();
        }
    }

    @Benchmark
    public float arrayList() {
        List<Integer>[] l = arrays;
        int c = count;
        float result = 0;
        for (int i = 0; i < c; i++) {
            result += sum(l[i]);
        }
        return result;
    }

    @Benchmark
    public float emptyList() {
        List<Integer>[] l = empty;
        int c = count;
        float result = 0;
        for (int i = 0; i < c; i++) {
            result += sum(l[i]);
        }
        return result;
    }

    @Benchmark
    public float biList() {
        List<Integer>[] l = bimorphic;
        int c = count;
        float result = 0;
        for (int i = 0; i < c; i++) {
            result += sum(l[i]);
        }
        return result;
    }

    @Benchmark
    public float polyList() {
        List<Integer>[] l = polimorphic;
        int c = count;
        float result = 0;
        for (int i = 0; i < c; i++) {
            result += sum(l[i]);
        }
        return result;
    }

    int sum(List<Integer> list) {
        int sum = 0;
        for (int i = 0; i < list.size(); ++i) {
            sum += list.get(i);
        }
        return sum;
    }
}

Results are:

Benchmark               (count)  Mode  Cnt       Score       Error  Units
ExampleBench.arrayList    10000  avgt    5   22902.547 ± 27665.651  ns/op
ExampleBench.biList       10000  avgt    5   50459.552 ±   739.379  ns/op
ExampleBench.emptyList    10000  avgt    5    3745.469 ±   211.794  ns/op
ExampleBench.polyList     10000  avgt    5  164879.943 ±  5830.008  ns/op

In case of monomorphic and bimorphic calls JIT replaces virtual call by concrete implementations. For example in case of arrayList() we have the following output for -XX:+PrintInlining:

@ 27   edu.jvm.runtime.ExampleBench::sum (38 bytes)   inline (hot)
   @ 6   java.util.ArrayList::size (5 bytes)   accessor
    \-> TypeProfile (15648/15648 counts) = java/util/ArrayList

for emptyList():

@ 27   edu.jvm.runtime.ExampleBench::sum (38 bytes)   inline (hot)
   @ 6   java.util.Collections$EmptyList::size (2 bytes)   inline (hot)
    \-> TypeProfile (9913/9913 counts) = java/util/Collections$EmptyList

for biList():

@ 27   edu.jvm.runtime.ExampleBench::sum (38 bytes)   inline (hot)
   @ 6   java.util.Collections$EmptyList::size (2 bytes)   inline (hot)
   @ 6   java.util.ArrayList::size (5 bytes)   accessor
    \-> TypeProfile (2513/5120 counts) = java/util/ArrayList
    \-> TypeProfile (2607/5120 counts) = java/util/Collections$EmptyList

In case of polyList() JIT does not inline any implementation and uses true virtual call.

What is the advantages of using inline functions in these methods? Let's look at compiler-generated code for arrayList():

0x00007ff9e51bce50: cmp $0xf80036dc,%r10d     ;instance of 'java/util/ArrayList'
0x00007ff9e51bce57: jne L0000                 ;if false go to L0000 (invokeinterface size)
0x00007ff9e51bce59: mov 0x10(%rdx),%ebp       ;*getfield size optimization java.util.ArrayList::size@1 

.....

0x00007ff9e51bce6d: retq
             L0000: mov $0xffffffde,%esi      ; true virtual call starts here
0x00007ff9e51bce73: mov %rdx,(%rsp)
0x00007ff9e51bce77: callq 0x00007ff9e50051a0  ; OopMap{[0]=Oop off=92}
                                              ;*invokeinterface size
                                              ; - edu.jvm.runtime.ExampleBench::sum@6 (line 119)
                                              ;   {runtime_call}

As you can see JIT replaces virtual call by getfield.

like image 69
Ivan Mamontov Avatar answered Oct 08 '22 02:10

Ivan Mamontov


Collections.emptyList() always returns the same, immutable empty list object (a singleton). Creating an ArrayList on the other hand actually creates a new object, allocates memory, and that object must be GCed later.

There shouldn't be a significant difference, but Collections.emptyList() does less work. The operations are not functionally equivalent. One allows getting an immutable empty list, whereas the other allows creating a new mutable list. Choose one or the other based on the functionality you want. Not for performance reasons.

like image 39
JB Nizet Avatar answered Oct 08 '22 03:10

JB Nizet