Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance benefits of a static empty array instance

It seems common practice to extract a constant empty array return value into a static constant. Like here:

public class NoopParser implements Parser {
    private static final String[] EMPTY_ARRAY = new String[0];

    @Override public String[] supportedSchemas() {
        return EMPTY_ARRAY;
    }

    // ...
}

Presumably this is done for performance reasons, since returning new String[0] directly would create a new array object every time the method is called – but would it really?

I've been wondering if there really is a measurable performance benefit in doing this or if it's just outdated folk wisdom. An empty array is immutable. Is the VM not able to roll all empty String arrays into one? Can the VM not make new String[0] basically free of cost?

Contrast this practice with returning an empty String: we're usually perfectly happy to write return "";, not return EMPTY_STRING;.

like image 569
glts Avatar asked Dec 15 '15 23:12

glts


People also ask

Why is an array empty?

An empty array is essentially immutable, so caching the instance isn't a problem. And it allows you to forego special-casing empty array creation in your algorithms if you find yourself eyeballing an elegant code path that is creating tons of empty arrays.

What does an empty array return?

If the array has a length of zero, then it does not contain any element. To return an empty array from a function, we can create a new array with a zero size. In the example below, we create a function returnEmptyArray() that returns an array of int . We return new int[0] that is an empty array of int .

What is an empty array C#?

To empty an array in C#, use the Array Clear() method: The Array. Clear method in C# clears i.e.zeros out all elements. In the below example, we have first considered an array with three elements − int[] arr = new int[] {88, 45, 76}; Now we have used the Array.Clear method to zero out all the arrays − Array.

How do you use an empty array in Java?

To create an empty array, you can use an array initializer. The length of the array is equal to the number of items enclosed within the braces of the array initializer. Java allows an empty array initializer, in which case the array is said to be empty.


Video Answer


2 Answers

I'm most interested in the actual performance difference between these two idioms in practical, real-world situations. I have no experience in micro-benchmarking (and it is probably not the right tool for such a question) but I gave it a try anyway.

This benchmark models a somewhat more typical, "realistic" setting. The returned array is just looked at and then discarded. No references hanging around, no requirement for reference equality.

One interface, two implementations:

public interface Parser {
    String[] supportedSchemas();
    void parse(String s);
}
public class NoopParserStaticArray implements Parser {
    private static final String[] EMPTY_STRING_ARRAY = new String[0];

    @Override public String[] supportedSchemas() {
        return EMPTY_STRING_ARRAY;
    }

    @Override public void parse(String s) {
        s.codePoints().count();
    }
}
public class NoopParserNewArray implements Parser {
    @Override public String[] supportedSchemas() {
        return new String[0];
    }

    @Override public void parse(String s) {
        s.codePoints().count();
    }
}

And the JMH benchmark:

import org.openjdk.jmh.annotations.Benchmark;

public class EmptyArrayBenchmark {
    private static final Parser NOOP_PARSER_STATIC_ARRAY = new NoopParserStaticArray();
    private static final Parser NOOP_PARSER_NEW_ARRAY = new NoopParserNewArray();

    @Benchmark
    public void staticEmptyArray() {
        Parser parser = NOOP_PARSER_STATIC_ARRAY;
        for (String schema : parser.supportedSchemas()) {
            parser.parse(schema);
        }
    }

    @Benchmark
    public void newEmptyArray() {
        Parser parser = NOOP_PARSER_NEW_ARRAY;
        for (String schema : parser.supportedSchemas()) {
            parser.parse(schema);
        }
    }
}

The result on my machine, Java 1.8.0_51 (HotSpot VM):

Benchmark                              Mode  Cnt           Score          Error  Units
EmptyArrayBenchmark.staticEmptyArray  thrpt   60  3024653836.077 ± 37006870.221  ops/s
EmptyArrayBenchmark.newEmptyArray     thrpt   60  3018798922.045 ± 33953991.627  ops/s
EmptyArrayBenchmark.noop              thrpt   60  3046726348.436 ±  5802337.322  ops/s

There is no significant difference between the two approaches in this case. In fact, they are indistinguishable from the no-op case: apparently the JIT compiler recognises that the returned array is always empty and optimises the loop away entirely!

Piping parser.supportedSchemas() into the black hole instead of looping over it, gives the static array instance approach a ~30% advantage. But they're definitely of the same magnitude:

Benchmark                              Mode  Cnt           Score         Error  Units
EmptyArrayBenchmark.staticEmptyArray  thrpt   60   338971639.355 ±  738069.217  ops/s
EmptyArrayBenchmark.newEmptyArray     thrpt   60   266936194.767 ±  411298.714  ops/s
EmptyArrayBenchmark.noop              thrpt   60  3055609298.602 ± 5694730.452  ops/s

Perhaps in the end the answer is the usual "it depends". I have a hunch that in many practical scenarios, the performance benefit in factoring out the array creation is not significant.

I think it is fair to say that

  • if the method contract gives you the freedom to return a new empty array instance every time, and
  • unless you need to guard against problematic or pathological usage patterns and/or aim for theoretical max performance,

then returning new String[0] directly is fine.

Personally, I like the expressiveness and concision of return new String[0]; and not having to introduce an extra static field.


By some strange coincidence, a month after I wrote this a real performance engineer investigated the problem: see this section in Alexey Shipilёv's blog post 'Arrays of Wisdom of the Ancients':

As expected, the only effect whatsoever can be observed on a very small collection sizes, and this is only a marginal improvement over new Foo[0]. This improvement does not seem to justify caching the array in the grand scheme of things. As a teeny tiny micro-optimization, it might make sense in some tight code, but I wouldn’t care otherwise.

That settles it. I'll take the tick mark and dedicate it to Alexey.

like image 97
glts Avatar answered Sep 22 '22 18:09

glts


I will go out on a limb and say that the performance benefit, even though using constant is much faster, is not actually relevant; because the software will likely spend a lot more time in doing other things besides returning empty arrays. If the total run-time is even hours a few extra seconds spent in creating an array does not mean much. By the same logic, memory consumption is not relevant either.

The only reason I can think of for doing this is readability.

like image 36
Miserable Variable Avatar answered Sep 25 '22 18:09

Miserable Variable