
Stream.reduce() vs Stream.parallel().reduce()

I really want to know the exact difference between Stream.reduce() and Stream.parallel().reduce().

To check, I wrote a small program and found that the two results are not equal for the same input values.

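import java.util.stream.Stream;

public class Test {

    public static void main(String[] args) {
        // Sequential reduce: the seed value 5 is applied once.
        int a = Stream.of(1, 2, 3).map(i -> i * 10).reduce(5, (abc, cde) -> abc + cde);
        // Parallel reduce: same seed value and accumulator.
        int b = Stream.of(1, 2, 3).map(i -> i * 10)
                .parallel().reduce(5, (abc, cde) -> abc + cde);
        System.out.println(a == b); // false
    }
}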

So, does this mean that the two behave differently? If so, please help me understand how they differ in functionality.

Sachin Sachdeva asked Oct 16 '17 10:10



1 Answer

It seems that you are misusing the reduce function. When using reduce with an identity value, you have to make sure that the value really is an identity for the associative function you reduce with.

See the full documentation, and a good explanation of what reduce does here. The reduce javadoc says:

The identity value must be an identity for the accumulator function. This means that for all t, accumulator.apply(identity, t) is equal to t. The accumulator function must be an associative function.

In your case, 5 is not an identity for the + function you are using to reduce, which leads to strange results with a parallel reduce. 0 is the identity of addition, so a correct way to compute this would be to add 5 to the list and use reduce(0, (x, y) -> x + y). Additionally, since you are reducing a stream of Integer to an Integer, you can omit the identity and simply use reduce((x, y) -> x + y), which returns an Optional<Integer> rather than a plain value.
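As a minimal sketch of the fix, the program below reduces with 0 as the identity and adds the 5 once, outside the reduction. That is a small variation on adding 5 to the list. The class and method names (ReduceWithIdentity, sumPlusOffset, sumNoIdentity) are made up for illustration and are not part of the question.

```java
import java.util.Optional;
import java.util.stream.Stream;

public class ReduceWithIdentity {

    // Sums the mapped values with 0 as the identity, then adds the offset exactly once.
    static int sumPlusOffset(boolean parallel, int offset) {
        Stream<Integer> values = Stream.of(1, 2, 3).map(i -> i * 10);
        if (parallel) {
            values = values.parallel();
        }
        return offset + values.reduce(0, Integer::sum);
    }

    // Identity-free variant: the result is an empty Optional for an empty stream.
    static Optional<Integer> sumNoIdentity() {
        return Stream.of(1, 2, 3).map(i -> i * 10).parallel().reduce(Integer::sum);
    }

    public static void main(String[] args) {
        int a = sumPlusOffset(false, 5);
        int b = sumPlusOffset(true, 5);
        System.out.println(a + " " + b + " " + (a == b)); // 65 65 true
        System.out.println(sumNoIdentity());              // Optional[60]
    }
}
```

Because 0 is a true identity for addition and addition is associative, the sequential and parallel versions agree however the stream is split.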

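The reason is that a parallel reduce relies on the identity being a true mathematical identity in order to split the work: each chunk of the stream is reduced starting from the identity, and the partial results are then combined. In your case, this injects the value 5 into the computation several times instead of once.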
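The effect can be reproduced without any threads. The sketch below is a simplified, hypothetical model of a chunked reduce and not the JDK's actual implementation: it seeds every chunk with the identity and then combines the partial results. The names IdentityPerChunk, reduceChunk and reduceInChunks are invented for this example, and the real splitting of a parallel stream is implementation-dependent.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

public class IdentityPerChunk {

    // Reduces one chunk sequentially, starting from the given identity.
    static int reduceChunk(List<Integer> chunk, int identity, BinaryOperator<Integer> op) {
        int result = identity;
        for (int value : chunk) {
            result = op.apply(result, value);
        }
        return result;
    }

    // Simplified model: every chunk is seeded with the identity,
    // then the partial results are combined with the same operator.
    static int reduceInChunks(List<List<Integer>> chunks, int identity, BinaryOperator<Integer> op) {
        int combined = reduceChunk(chunks.get(0), identity, op);
        for (List<Integer> chunk : chunks.subList(1, chunks.size())) {
            combined = op.apply(combined, reduceChunk(chunk, identity, op));
        }
        return combined;
    }

    public static void main(String[] args) {
        List<List<Integer>> oneChunk = Arrays.asList(Arrays.asList(10, 20, 30));
        List<List<Integer>> threeChunks =
                Arrays.asList(Arrays.asList(10), Arrays.asList(20), Arrays.asList(30));

        System.out.println(reduceInChunks(oneChunk, 5, Integer::sum));    // 65
        System.out.println(reduceInChunks(threeChunks, 5, Integer::sum)); // 75 = 15 + 25 + 35
        System.out.println(reduceInChunks(threeChunks, 0, Integer::sum)); // 60
    }
}
```

With a seed of 5 the result depends on the number of chunks, while with the true identity 0 it does not. That is why the sequential and parallel results in the question disagree.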

tonio answered Sep 23 '22 05:09