.toArray(new MyClass[0]) or .toArray(new MyClass[myList.size()])?

Question

Counterintuitively, the fastest version, on Hotspot 8, is:

MyClass[] arr = myList.toArray(new MyClass[0]);

I have run a micro benchmark using jmh the results and code are below, showing that the version with an empty array consistently outperforms the version with a presized array. Note that if you can reuse an existing array of the correct size, the result may be different.

Benchmark results (score in microseconds, smaller = better):

Benchmark                      (n)  Mode  Samples    Score   Error  Units
c.a.p.SO29378922.preSize         1  avgt       30    0.025 ▒ 0.001  us/op
c.a.p.SO29378922.preSize       100  avgt       30    0.155 ▒ 0.004  us/op
c.a.p.SO29378922.preSize      1000  avgt       30    1.512 ▒ 0.031  us/op
c.a.p.SO29378922.preSize      5000  avgt       30    6.884 ▒ 0.130  us/op
c.a.p.SO29378922.preSize     10000  avgt       30   13.147 ▒ 0.199  us/op
c.a.p.SO29378922.preSize    100000  avgt       30  159.977 ▒ 5.292  us/op
c.a.p.SO29378922.resize          1  avgt       30    0.019 ▒ 0.000  us/op
c.a.p.SO29378922.resize        100  avgt       30    0.133 ▒ 0.003  us/op
c.a.p.SO29378922.resize       1000  avgt       30    1.075 ▒ 0.022  us/op
c.a.p.SO29378922.resize       5000  avgt       30    5.318 ▒ 0.121  us/op
c.a.p.SO29378922.resize      10000  avgt       30   10.652 ▒ 0.227  us/op
c.a.p.SO29378922.resize     100000  avgt       30  139.692 ▒ 8.957  us/op

For reference, the code:

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
public class SO29378922 {
  @Param({"1", "100", "1000", "5000", "10000", "100000"}) int n;
  private final List<Integer> list = new ArrayList<>();
  @Setup public void populateList() {
    for (int i = 0; i < n; i++) list.add(0);
  }
  @Benchmark public Integer[] preSize() {
    return list.toArray(new Integer[n]);
  }
  @Benchmark public Integer[] resize() {
    return list.toArray(new Integer[0]);
  }
}

You can find similar results, full analysis, and discussion in the blog post Arrays of Wisdom of the Ancients. To summarize: the JVM and JIT compiler contains several optimizations that enable it to cheaply create and initialize a new correctly sized array, and those optimizations can not be used if you create the array yourself.

As of ArrayList in Java 5, the array will be filled already if it has the right size (or is bigger). Consequently

MyClass[] arr = myList.toArray(new MyClass[myList.size()]);

will create one array object, fill it and return it to "arr". On the other hand

MyClass[] arr = myList.toArray(new MyClass[0]);

will create two arrays. The second one is an array of MyClass with length 0. So there is an object creation for an object that will be thrown away immediately. As far as the source code suggests the compiler / JIT cannot optimize this one so that it is not created. Additionally, using the zero-length object results in casting(s) within the toArray() - method.

See the source of ArrayList.toArray():

public <T> T[] toArray(T[] a) {
    if (a.length < size)
        // Make a new array of a's runtime type, but my contents:
        return (T[]) Arrays.copyOf(elementData, size, a.getClass());
    System.arraycopy(elementData, 0, a, 0, size);
    if (a.length > size)
        a[size] = null;
    return a;
}

Use the first method so that only one object is created and avoid (implicit but nevertheless expensive) castings.

From JetBrains Intellij Idea inspection:

There are two styles to convert a collection to an array: either using a pre-sized array (like c.toArray(new String[c.size()])) or using an empty array (like c.toArray(new String[0]).

In older Java versions using pre-sized array was recommended, as the reflection call which is necessary to create an array of proper size was quite slow. However since late updates of OpenJDK 6 this call was intrinsified, making the performance of the empty array version the same and sometimes even better, compared to the pre-sized version. Also passing pre-sized array is dangerous for a concurrent or synchronized collection as a data race is possible between the size and toArray call which may result in extra nulls at the end of the array, if the collection was concurrently shrunk during the operation.

This inspection allows to follow the uniform style: either using an empty array (which is recommended in modern Java) or using a pre-sized array (which might be faster in older Java versions or non-HotSpot based JVMs).

Modern JVMs optimise reflective array construction in this case, so the performance difference is tiny. Naming the collection twice in such boilerplate code is not a great idea, so I'd avoid the first method. Another advantage of the second is that it works with synchronised and concurrent collections. If you want to make optimisation, reuse the empty array (empty arrays are immutable and can be shared), or use a profiler(!).

toArray checks that the array passed is of the right size (that is, large enough to fit the elements from your list) and if so, uses that. Consequently if the size of the array provided it smaller than required, a new array will be reflexively created.

In your case, an array of size zero, is immutable, so could safely be elevated to a static final variable, which might make your code a little cleaner, which avoids creating the array on each invocation. A new array will be created inside the method anyway, so it's a readability optimisation.

Arguably the faster version is to pass the array of a correct size, but unless you can prove this code is a performance bottleneck, prefer readability to runtime performance until proven otherwise.

The first case is more efficient.

That is because in the second case:

MyClass[] arr = myList.toArray(new MyClass[0]);

the runtime actually creates an empty array (with zero size) and then inside the toArray method creates another array to fit the actual data. This creation is done using reflection using the following code (taken from jdk1.5.0_10):

public <T> T[] toArray(T[] a) {
    if (a.length < size)
        a = (T[])java.lang.reflect.Array.
    newInstance(a.getClass().getComponentType(), size);
System.arraycopy(elementData, 0, a, 0, size);
    if (a.length > size)
        a[size] = null;
    return a;
}

By using the first form, you avoid the creation of a second array and also avoid the reflection code.

.toArray(new MyClass[0]) or .toArray(new MyClass[myList.size()])?

Tags:

java

performance

coding-style

Recent Activity

Donate For Us

.toArray(new MyClass[0]) or .toArray(new MyClass[myList.size()])?

Tags:

java

performance

coding-style

Related questions

Recent Activity

Donate For Us