
IntelliJ IDEA suggests replacing for loops with foreach method. Should I always do that when possible?

IDEA suggests replacing, for example, this:

for (Point2D vertex : graph.vertexSet()) {
  union.addVertex(vertex);
}

with this:

graph.vertexSet().forEach(union::addVertex);

This new version is certainly much more readable. But are there situations where I'd be better off sticking to the good old language construct for iterables rather than using the new forEach method?

For instance, if I understand correctly, the method reference mechanism implies constructing an anonymous Consumer object that otherwise (with the for language construct) would not be constructed. Could that become a performance bottleneck for some actions?

So I wrote this not very exhaustive benchmark:

package org.sample;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Threads;
import org.openjdk.jmh.infra.Blackhole;
import org.tendiwa.geometry.Point2D;

import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class LanguageConstructVsForeach {
    private static final int NUMBER_OF_POINTS = 10000;
    private static final List<Point2D> points = IntStream
        .range(0, NUMBER_OF_POINTS)
        .mapToObj(i -> new Point2D(i, i * 2))
        .collect(Collectors.toList());

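    // Enhanced-for loop; each element is handed to the Blackhole
    // so the JIT cannot eliminate the loop body.
    @Benchmark
    @Threads(1)
    @Fork(3)
    public void languageConstructToBlackhole(Blackhole bh) {
        for (Point2D point : points) {
            bh.consume(point);
        }
    }

    // Same work, expressed as forEach() with a method reference.
    @Benchmark
    @Threads(1)
    @Fork(3)
    public void foreachToBlackhole(Blackhole bh) {
        points.forEach(bh::consume);
    }

    // Enhanced-for loop copying into a presized list.
    // The list is returned so that JMH consumes the result.
    @Benchmark
    @Threads(1)
    @Fork(3)
    public List<Point2D> languageConstructToList(Blackhole bh) {
        List<Point2D> list = new ArrayList<>(NUMBER_OF_POINTS);
        for (Point2D point : points) {
            list.add(point);
        }
        return list;
    }

    // Same copy, expressed as forEach() with a method reference.
    @Benchmark
    @Threads(1)
    @Fork(3)
    public List<Point2D> foreachToList(Blackhole bh) {
        List<Point2D> list = new ArrayList<>(NUMBER_OF_POINTS);
        points.forEach(list::add);
        return list;
    }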

}

And got:

Benchmark                                                       Mode  Samples      Score     Error  Units
o.s.LanguageConstructVsForeach.foreachToBlackhole              thrpt       60  33693.834 ± 894.138  ops/s
o.s.LanguageConstructVsForeach.foreachToList                   thrpt       60   7753.941 ± 239.081  ops/s
o.s.LanguageConstructVsForeach.languageConstructToBlackhole    thrpt       60  16043.548 ± 644.432  ops/s
o.s.LanguageConstructVsForeach.languageConstructToList         thrpt       60   6499.527 ± 202.589  ops/s

How come forEach is more efficient in both cases: when I do virtually nothing and when I do some actual work? Doesn't forEach simply encapsulate an Iterator? Is this benchmark even correct? If it is, is there any reason today to use the old language construct with Java 8?

gvlasov asked Oct 28 '14 18:10


2 Answers

You're comparing the language's "enhanced-for" loop with the Iterable.forEach() method. The benchmark isn't obviously wrong, and the results might seem surprising, until you dig into the implementations.

Note that the points list is an instance of ArrayList since that's what's created by the Collectors.toList() collector.

The enhanced-for loop on an Iterable gets an Iterator from it and then calls hasNext() and next() repeatedly until there are no more elements. (This differs from the enhanced-for loop over an array, which does arithmetic and direct array element access.) Thus, when looping over an Iterable, this loop will perform a minimum of two method calls per iteration.
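To make the two calls per iteration concrete, here is a minimal sketch of the enhanced-for loop next to the explicit Iterator loop it is roughly equivalent to. The class and method names are hypothetical and chosen only for illustration; this shows the shape of the translation, not actual compiler output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class; the name is not from the original post.
final class EnhancedForDesugaring {

    // The enhanced-for form, as it would appear in source code.
    static List<String> copyWithEnhancedFor(Iterable<String> source) {
        List<String> out = new ArrayList<>();
        for (String s : source) {
            out.add(s);
        }
        return out;
    }

    // Roughly the equivalent explicit form: each element costs
    // one hasNext() call and one next() call.
    static List<String> copyWithExplicitIterator(Iterable<String> source) {
        List<String> out = new ArrayList<>();
        for (Iterator<String> it = source.iterator(); it.hasNext(); ) {
            String s = it.next();
            out.add(s);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "c");
        System.out.println(copyWithEnhancedFor(input));      // prints [a, b, c]
        System.out.println(copyWithExplicitIterator(input)); // prints [a, b, c]
    }
}
```

Both methods produce the same list; the second one just spells out the Iterator calls that the first one hides.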

By contrast, calling ArrayList.forEach() runs a conventional, int-based for-loop over the array containing the list elements, and calls the lambda once per iteration. There's only one call per iteration here, as opposed to two calls per iteration for the enhanced-for loop. That might explain why ArrayList.forEach() is faster in this case.
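The shape of such a specialized forEach() can be sketched with a stripped-down array-backed list. TinyArrayList is a hypothetical class written for this illustration and is not the JDK source; the real ArrayList.forEach() does more, for example checking for concurrent modification.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.Consumer;

// Hypothetical, stripped-down array-backed list; NOT the JDK's ArrayList.
final class TinyArrayList<E> {
    private Object[] elements = new Object[8];
    private int size;

    void add(E e) {
        if (size == elements.length) {
            // Grow the backing array when it is full.
            elements = Arrays.copyOf(elements, size * 2);
        }
        elements[size++] = e;
    }

    // An int-indexed loop over the backing array: one call to
    // action.accept() per element, with no Iterator involved.
    @SuppressWarnings("unchecked")
    void forEach(Consumer<? super E> action) {
        Objects.requireNonNull(action);
        for (int i = 0; i < size; i++) {
            action.accept((E) elements[i]);
        }
    }

    public static void main(String[] args) {
        TinyArrayList<Integer> list = new TinyArrayList<>();
        for (int i = 0; i < 20; i++) {
            list.add(i);
        }
        List<Integer> seen = new ArrayList<>();
        list.forEach(seen::add);
        System.out.println(seen.size()); // prints 20
    }
}
```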

The blackhole case seems to do very little work other than run the loops, so these cases seem to be measuring pure loop overhead. That may be why ArrayList.forEach() shows such a big advantage here.

When the loop does just a little bit of work (adding to a destination list), there is still a speed advantage for ArrayList.forEach(), but the difference is much smaller. I suspect that if you were to do more work inside the loop, the advantage would be even smaller. This shows that the loop overhead for either construct is very small. Try using Blackhole.consumeCPU() in the loop. I wouldn't be surprised if the results for the two constructs become indistinguishable.

Note that the big speed advantage occurs because Iterable.forEach() ends up having a specialized implementation in ArrayList.forEach(). If you were to run forEach() over a different data structure, you'd probably get different results.
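As an illustration of the non-specialized case, the sketch below wraps a list in a hypothetical CountingIterable class (not from the original post) that does not override forEach(). The default Iterable.forEach() is specified to behave like an enhanced-for loop over the iterable, so the iterator calls come back.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical wrapper that counts Iterator calls; for illustration only.
final class CountingIterable<T> implements Iterable<T> {
    private final List<T> delegate;
    int hasNextCalls;
    int nextCalls;

    CountingIterable(List<T> delegate) {
        this.delegate = delegate;
    }

    // forEach() is deliberately NOT overridden, so the default
    // Iterable.forEach() implementation is used.
    @Override
    public Iterator<T> iterator() {
        final Iterator<T> it = delegate.iterator();
        return new Iterator<T>() {
            @Override
            public boolean hasNext() {
                hasNextCalls++;
                return it.hasNext();
            }

            @Override
            public T next() {
                nextCalls++;
                return it.next();
            }
        };
    }

    public static void main(String[] args) {
        CountingIterable<String> items =
            new CountingIterable<>(Arrays.asList("a", "b", "c"));
        StringBuilder sb = new StringBuilder();
        items.forEach(sb::append);
        // prints abc hasNext=4 next=3
        System.out.println(sb + " hasNext=" + items.hasNextCalls
            + " next=" + items.nextCalls);
    }
}
```

Here forEach() over three elements costs four hasNext() calls and three next() calls in addition to the three lambda invocations, so a structure without its own forEach() would not be expected to show the advantage measured above.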

I wouldn't use this as justification for blindly replacing all enhanced-for loops with calls to Iterable.forEach(). Write the code that's the clearest and that makes the most sense. If you are writing performance critical code, benchmark it! Different forms will have different performance, depending on the workload, what data structure is being traversed, etc.

Stuart Marks answered Oct 25 '22 03:10


One obvious reason to use the older style is that you will be compatible with Java 7. If your code uses lots of newfangled Java 8 goodies, that is not an option, but if you are only using a few new features, this could be an advantage, especially if your code is in a general-purpose library.

user949300 answered Oct 25 '22 02:10