Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java 8 - External Iteration performing better than Internal Iteration?

So I was reading a book for Java 8, when I saw them doing comparison between external and internal iteration and thought about comparing the two, performance wise.

I have a method that just sums up a sequence of integers up to n.

Iterative one:

private static long iterativeSum(long n) {

    long startTime = System.nanoTime();
    long sum = 0;

    for(long i=1; i<=n; i++) {
        sum+=i;
    }


    long endTime = System.nanoTime();
    System.out.println("Iterative Sum Duration: " + (endTime-startTime)/1000000);

    return sum;
}

Sequential One - using internal iteration

private static long sequentialSum(long n) {

    long startTime = System.nanoTime();

    //long sum = LongStream.rangeClosed(1L, n)
    long sum = Stream.iterate(1L, i -> i+1)
            .limit(n)
            .reduce(0L, (i,j) -> i+j);

    long endTime = System.nanoTime();
    System.out.println("Sequential Sum Duration: " + (endTime-startTime)/1000000);

    return sum;
}

I tried to do some benchmarking on them, it turns out that the one using external iteration performs far better than the one using internal iteration.

Here's my driver code:

public static void main(String[] args) {

    long n = 100000000L;

    for(int i=0;i<10000;i++){
    iterativeSum(n);
    sequentialSum(n);
    }
    iterativeSum(n);
    sequentialSum(n);
}

The running time for the Iteravtive one was always < 50ms whereas the execution time for Sequential one was always > 250ms.

I am not able to understand why internal iteration is not out performing external iteration here?

like image 204
AgentX Avatar asked Nov 07 '15 17:11

AgentX


People also ask

What is the difference between Java 8 internal and external iteration?

There are two types of iterators: external and internal. An external iterator is active, an internal is passive. When the client (i.e. the programmer) controls the iteration, the iterator is called external iterator. When the iterator controls it, it is called an internal iterator.

What are the benefits of using an external iterator as opposed to an internal iterator?

- An external iterator is implemented by a separate class which can be attached to the object which has iteration logic. - The advantage of external iterator is that, many iterators can be made active simultaneously on the existing or same object.

What is the difference between an external iterator and an internal iterator describe an advantage of the external iterator?

External Iterator :- Using this we have to iterate all element one-by-one and do some operation because programmer does have control on that, its a External Iterator. Internal Iterator :- Using this we can iterate according to our condition, Programmer can control over on it, its a Internal Iterator .

What is external iteration in Java?

External Iterators Definition(or Active Iterators) – With external iterators responsibility of iterating over the elements, and making sure that this iteration takes into account the total number of records, whether more records exist to be iterated and so on lies with the programmer.


1 Answers

Even though the presented results are completely irrelevant, the observed effect actually takes place: the Stream API do has an overhead which for such simple tasks cannot be totally eliminated in real applications even after warm-up. Let's write a JMH benchmark:

@Warmup(iterations = 5, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Fork(3)
@State(Scope.Benchmark)
public class IterativeSum {
    @Param({ "100", "10000", "1000000" })
    private int n;

    public static long iterativeSum(long n) {
        long sum = 0;

        for(long i=1; i<=n; i++) {
            sum+=i;
        }
        return sum;
    }

    @Benchmark
    public long is() {
        return iterativeSum(n);
    }
}

Here's baseline test: plain loop. The results on my box are the following:

Benchmark             (n)  Mode  Cnt     Score     Error  Units
IterativeSum.is       100  avgt   30     0.074 ±   0.001  us/op
IterativeSum.is     10000  avgt   30     6.361 ±   0.009  us/op
IterativeSum.is   1000000  avgt   30   688.527 ±   0.910  us/op

Here's your version of Stream API based iteration:

public static long sequentialSumBoxed(long n) {
    return Stream.iterate(1L, i -> i+1).limit(n)
                 .reduce(0L, (i,j) -> i+j);
}

@Benchmark
public long ssb() {
    return sequentialSumBoxed(n);
}

The results look like this:

Benchmark             (n)  Mode  Cnt     Score     Error  Units
IterativeSum.ssb      100  avgt   30     1.253 ±   0.084  us/op
IterativeSum.ssb    10000  avgt   30   134.959 ±   0.421  us/op
IterativeSum.ssb  1000000  avgt   30  9119.422 ±  22.817  us/op

Very disappointing: 13-21x slower. This version has many boxing operations inside, that's why primitive stream specializations were created. Let's check non-boxed version:

public static long sequentialSum(long n) {
    return LongStream.iterate(1L, i -> i+1).limit(n)
                     .reduce(0L, (i,j) -> i+j);
}

@Benchmark
public long ss() {
    return sequentialSum(n);
}

The results are the following:

Benchmark             (n)  Mode  Cnt     Score     Error  Units
IterativeSum.ss       100  avgt   30     0.661 ±   0.001  us/op
IterativeSum.ss     10000  avgt   30    67.498 ±   5.732  us/op
IterativeSum.ss   1000000  avgt   30  1982.687 ±  38.501  us/op

Much better now, but still 2.8-10x times slower. An alternative would be to use the range:

public static long rangeSum(long n) {
    return LongStream.rangeClosed(1, n).sum();
}

@Benchmark
public long rs() {
    return rangeSum(n);
}

The results are the following:

Benchmark             (n)  Mode  Cnt     Score     Error  Units
IterativeSum.rs       100  avgt   30     0.316 ±   0.001  us/op
IterativeSum.rs     10000  avgt   30    28.646 ±   0.065  us/op
IterativeSum.rs   1000000  avgt   30  2158.962 ± 514.780  us/op

Now it's 3.1-4.5x times slower. The cause of this slowness is that Stream API has very long call chain which hits the MaxInlineLevel JVM limit, so it cannot be fully inlined by default. You may increase this limit setting like -XX:MaxInlineLevel=20 and get the following result:

Benchmark             (n)  Mode  Cnt     Score     Error  Units
IterativeSum.rs       100  avgt   30     0.111 ±   0.001  us/op
IterativeSum.rs     10000  avgt   30     9.552 ±   0.017  us/op
IterativeSum.rs   1000000  avgt   30   729.935 ±  31.915  us/op

Much better: now it's only 1.05-1.5x slower.

The problem with this test is that the loop body of iterative version is very simple, thus could be efficiently unrolled and vectorized by JIT-compiler, and it's much harder to do this with the same efficiency for sophisticated Stream API code. However in real applications you're unlikely to sum consequtive numbers in a loop (why not write n*(n+1)/2 instead?). With real problems Stream API overhead is much lower even with default MaxInlineLevel setting.

like image 136
Tagir Valeev Avatar answered Oct 22 '22 13:10

Tagir Valeev