Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Enhanced for loop performance worse than traditional indexed lookup?

I just came across this seemingly innocuous comment, benchmarking ArrayList vs a raw String array. It's from a couple years ago, but the OP writes

I did notice that using for String s: stringsList was about 50% slower than using an old-style for-loop to access the list. Go figure...

Nobody commented on it in the original post, and the test seemed a little dubious (too short to be accurate), but I nearly fell out of my chair when I read it. I've never benchmarked an enhanced loop against a "traditional" one, but I'm currently working on a project that does hundreds of millions of iterations over ArrayList instances using enhanced loops so this is a concern to me.

I'm going to do some benchmarking and post my findings here, but this is obviously a big concern to me. I could find precious little info online about relative performance, except for a couple offhand mentions that enhanced loops for ArrayLists do run a lot slower under Android.

Has anybody experienced this? Does such a performance gap still exist? I'll post my findings here, but was very surprised to read it. I suspect that if this performance gap did exist, it has been fixed in more modern VM's, but I guess now I'll have to do some testing and confirm.

Update: I made some changes to my code, but was already suspecting what others here have already pointed out: sure the enhanced for loop is slower, but outside of very trivial tight loops, the cost should be a miniscule fraction of the cost of the logic of the loop. In my case, even though I'm iterating over very large lists of strings using enhanced loops, my logic inside the loop is complex enough that I couldn't even measure a difference after switching to index-based loops.

TL;DR: enhanced loops are indeed slower than a traditional index-based loop over an arraylist; but for most applications the difference should be negligible.

like image 905
jkraybill Avatar asked Jul 27 '11 04:07

jkraybill


People also ask

What do you think are the advantages of an enhanced for loop What are the disadvantages or limitations?

The advantage of the for-each loop is that it eliminates the possibility of bugs and makes the code more readable. It is known as the for-each loop because it traverses each element one by one. The drawback of the enhanced for loop is that it cannot traverse the elements in reverse order.

Is enhanced for loop faster than for loop?

In general, enhanced for loop is much easier to use and less error-prone than for loop, where you need to manage the steps manually. At the same time, for loop is much more powerful because you get the opportunity to control the looping process.

What are the advantages of using an enhanced loop?

MAJOR BENEFIT: This aids readability and clarity. To sum up, the enhanced for loop offers a concise higher level syntax to loop over a list or array which improves clarity and readability. However, it misses some parts: allowing to access the index loop or to remove an item.


3 Answers

The problem you have is that using an Iterator will be slower than using a direct lookup. On my machine the difference is about 0.13 ns per iteration. Using an array instead saves about 0.15 ns per iteration. This should be trivial in 99% of situations.

public static void main(String... args) {
    int testLength = 100 * 1000 * 1000;
    String[] stringArray = new String[testLength];
    Arrays.fill(stringArray, "a");
    List<String> stringList = new ArrayList<String>(Arrays.asList(stringArray));
    {
        long start = System.nanoTime();
        long total = 0;
        for (String str : stringArray) {
            total += str.length();
        }
        System.out.printf("The for each Array loop time was %.2f ns total=%d%n", (double) (System.nanoTime() - start) / testLength, total);
    }
    {
        long start = System.nanoTime();
        long total = 0;
        for (int i = 0, stringListSize = stringList.size(); i < stringListSize; i++) {
            String str = stringList.get(i);
            total += str.length();
        }
        System.out.printf("The for/get List loop time was %.2f ns total=%d%n", (double) (System.nanoTime() - start) / testLength, total);
    }
    {
        long start = System.nanoTime();
        long total = 0;
        for (String str : stringList) {
            total += str.length();
        }
        System.out.printf("The for each List loop time was %.2f ns total=%d%n", (double) (System.nanoTime() - start) / testLength, total);
    }
}

When run with one billion entries entries prints (using Java 6 update 26.)

The for each Array loop time was 0.76 ns total=1000000000
The for/get List loop time was 0.91 ns total=1000000000
The for each List loop time was 1.04 ns total=1000000000

When run with one billion entries entries prints (using OpenJDK 7.)

The for each Array loop time was 0.76 ns total=1000000000
The for/get List loop time was 0.91 ns total=1000000000
The for each List loop time was 1.04 ns total=1000000000

i.e. exactly the same. ;)

like image 108
Peter Lawrey Avatar answered Jan 01 '23 23:01

Peter Lawrey


Every claim that X is slower than Y on a JVM which does not address all the issues presented in this article ant it's second part spreads fears and lies about the performance of a typical JVM. This applies to the comment referred to by the original question as well as to GravityBringer's answer. I am sorry to be so rude, but unless you use appropriate micro benchmarking technology your benchmarks produce really badly skewed random numbers.

Tell me if you're interested in more explanations. Although it is all in the articles I referred to.

like image 30
jmg Avatar answered Jan 01 '23 23:01

jmg


GravityBringer's number doesn't seem right, because I know ArrayList.get() is as fast as raw array access after VM optimization.

I ran GravityBringer's test twice on my machine, -server mode

50574847
43872295
30494292
30787885
(2nd round)
33865894
32939945
33362063
33165376

The bottleneck in such tests is actually memory read/write. Judging from the numbers, the entire 2 arrays are in my L2 cache. If we decrease the size to fit L1 cache, or if we increase the size beyond L2 cache, we'll see 10X throughput difference.

The iterator of ArrayList uses a single int counter. Even if VM doesn't put it in a register (the loop body is too complex), at least it will be in the L1 cache, therefore r/w of are basically free.

The ultimate answer of course is to test your particular program in your particular environment.

Though it's not helpful to play agnostic whenever a benchmark question is raised.

like image 24
irreputable Avatar answered Jan 01 '23 21:01

irreputable