I wrote this code to reduce a list of words to a long count of how many words start with an 'A'. I'm just writing it to learn Java 8, so I'd like to understand it a little better [Disclaimer: I realize this is probably not the best way to write this code; it's just for practice!].
Long countOfAWords = results.stream().reduce(
0L,
(a, b) -> b.charAt(0) == 'A' ? a + 1 : a,
Long::sum);
The middle parameter/lambda (called the accumulator) would seem to be capable of reducing the full list without the final 'Combiner' parameter. In fact, the Javadoc actually says:
The {@code accumulator} function acts as a fused mapper and accumulator, which can sometimes be more efficient than separate mapping and reduction, such as when knowing the previously reduced value allows you to avoid some computation.
[Edit From Author] - The following statement is wrong, so don't let it confuse you; I'm just keeping it here so I don't ruin the original context of the answers.
Anyway, I can infer that the accumulator must just be outputting 1's and 0's which the combiner combines. I didn't find this particularly obvious from the documentation though.
My Question
Is there a way to see what the output would be before the combiner executes so I can see the list of 1's and 0's that the combiner combines? This would be helpful in debugging more complex situations which I'm sure I'll come across eventually.
Reducing is the repeated process of combining all elements into a single result. A reduce operation applies a binary operator to each element in the stream, where the first argument to the operator is the return value of the previous application and the second argument is the current stream element.
In Java, reducing is a terminal operation that aggregates a stream into a single value, either an object or a primitive. The Java 8 Stream API provides a set of predefined reduction operations such as average(), sum(), min(), max(), and count(); average() and sum() are available on the primitive streams (IntStream, LongStream, DoubleStream). These operations return a value by combining the elements of a stream.
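As a rough illustration of those predefined reductions, here is a minimal sketch. The class and method names are made up for this example; only the stream calls themselves come from the JDK.

```java
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical helper class, used only to demonstrate the built-in reductions.
class BuiltInReductionsSketch {
    // sum() exists on the primitive streams such as IntStream.
    static int total(int... values) {
        return IntStream.of(values).sum();
    }

    // max() returns an OptionalInt; getAsInt() throws if the stream was empty.
    static int largest(int... values) {
        return IntStream.of(values).max().getAsInt();
    }

    // average() returns an OptionalDouble for the same reason.
    static double mean(int... values) {
        return IntStream.of(values).average().getAsDouble();
    }

    // count() is available on object streams as well.
    static long howMany(String... words) {
        return Stream.of(words).count();
    }

    public static void main(String[] args) {
        System.out.println(total(3, 1, 2));
        System.out.println(largest(3, 1, 2));
        System.out.println(mean(3, 1, 2));
        System.out.println(howMany("a", "b"));
    }
}
```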
The identity is the default result of the reduction when the stream has no elements. That is why this version of reduce does not return an Optional: it always returns at least the identity element. The identity must also be a true identity for the combiner (combining it with any value leaves that value unchanged); ignoring this rule will result in unexpected outcomes.
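To make the identity rule concrete, here is a small sketch. The class and method names are hypothetical, and the "bad identity" variant is deliberately wrong to show what goes astray.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical class for illustration only.
class IdentitySketch {
    // 0L is a valid identity for Long::sum: 0 + u == u for every u.
    static long countA(List<String> words) {
        return words.stream().reduce(
                0L,
                (count, word) -> word.charAt(0) == 'A' ? count + 1 : count,
                Long::sum);
    }

    // Deliberately wrong: 1L is NOT an identity for Long::sum,
    // so every result is shifted (and parallel runs may be off by more).
    static long countAWithBadIdentity(List<String> words) {
        return words.stream().reduce(
                1L,
                (count, word) -> word.charAt(0) == 'A' ? count + 1 : count,
                Long::sum);
    }

    public static void main(String[] args) {
        List<String> empty = Collections.emptyList();
        System.out.println(countA(empty));                // the identity, 0
        System.out.println(countAWithBadIdentity(empty)); // the bogus identity, 1
        System.out.println(countA(Arrays.asList("A", "B", "A")));
    }
}
```

With an empty list the reduction simply hands back whatever identity it was given, which is why a wrong identity silently skews the count.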
Java 8 introduced streams. Not to be confused with input/output streams, these Java 8+ streams can also process the data that goes through them. They were hailed as a great new feature that allowed coders to write algorithms in a more readable (and therefore more maintainable) way.
The combiner does not reduce a list of 0's and 1's. When the stream is not run in parallel, the combiner is not used in this case, so the reduction is equivalent to the following loop:
U result = identity;
for (T element : this stream)
result = accumulator.apply(result, element);
return result;
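The pseudocode above can be turned into something runnable. This is only a sketch under the assumption that the accumulator is the one from the question; the class, constant, and method names are invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical class comparing a sequential reduce with the equivalent loop.
class SequentialReduceSketch {
    // Same accumulator as in the question: bump the count for words starting with 'A'.
    static final BiFunction<Long, String, Long> ACCUMULATOR =
            (count, word) -> word.charAt(0) == 'A' ? count + 1 : count;

    static long viaReduce(List<String> words) {
        return words.stream().reduce(0L, ACCUMULATOR, Long::sum);
    }

    // Hand-written version of what the sequential reduction does.
    static long viaLoop(List<String> words) {
        Long result = 0L; // the identity
        for (String word : words) {
            result = ACCUMULATOR.apply(result, word);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> results = Arrays.asList("A", "B", "A", "A", "C", "A", "A");
        System.out.println(viaReduce(results) == viaLoop(results)); // prints true
    }
}
```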
When you run the stream in parallel, the task is split across multiple threads. For example, the data in the pipeline is partitioned into chunks that are evaluated independently, each producing its own result. The combiner is then used to merge these results.
So you won't see a list being reduced. Instead, the combiner receives two values at a time, each either the identity value or a partial result computed by another task, and sums them. For example, if you add a print statement in the combiner
(i1, i2) -> {System.out.println("Merging: "+i1+"-"+i2); return i1+i2;});
you could see something like this:
Merging: 0-0
Merging: 0-0
Merging: 1-0
Merging: 1-0
Merging: 1-1
This would be helpful in debugging more complex situations which I'm sure I'll come across eventually.
More generally, if you want to see the data in the pipeline on the go, you can use peek (or the debugger could also help). So, applied to your example:
long countOfAWords = results.stream().map(s -> s.charAt(0) == 'A' ? 1 : 0).peek(System.out::print).mapToLong(l -> l).sum();
which can output:
100100
[Disclaimer: I realize this is probably not the best way to write this code; it's just for practice!].
The idiomatic way to achieve your task would be to filter the stream and then simply use count:
long countOfAWords = results.stream().filter(s -> s.charAt(0) == 'A').count();
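For completeness, here is a self-contained sketch of the filter-and-count approach. Note one swap that is my own assumption rather than part of the original answer: it uses startsWith("A") instead of charAt(0) == 'A', because charAt(0) throws a StringIndexOutOfBoundsException on an empty string. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical class for illustration only.
class CountAWordsSketch {
    // Keep only the words that start with "A", then count what is left.
    static long countAWords(List<String> words) {
        return words.stream()
                .filter(word -> word.startsWith("A")) // safe for empty strings
                .count();
    }

    public static void main(String[] args) {
        List<String> results = Arrays.asList("A", "B", "A", "A", "C", "A", "A");
        System.out.println(countAWords(results)); // prints 5
    }
}
```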
Hope it helps! :)
One way to see what's going on is to replace the method reference Long::sum by a lambda that includes a println.
List<String> results = Arrays.asList("A", "B", "A", "A", "C", "A", "A");
Long countOfAWords = results.stream().reduce(
0L,
(a, b) -> b.charAt(0) == 'A' ? a + 1 : a,
(a, b) -> {
System.out.println(a + " " + b);
return Long.sum(a, b);
});
In this case, we can see that the combiner is not actually used. This is because the stream is not parallel. All we are really doing is using the accumulator to successively combine each String with the current Long result; no two Long values are ever combined.
If you replace stream by parallelStream you can see that the combiner is used and look at the values it combines.
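A sketch of that experiment, assuming the same list as above, might look like the following. The class name, the parallel flag, and the call counter are all hypothetical additions for illustration. How many times the combiner runs in parallel, and with which values, depends on how the runtime splits the work, so the printed lines will vary from run to run.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Hypothetical class that traces combiner calls for sequential and parallel runs.
class CombinerTraceSketch {
    static long countA(List<String> words, boolean parallel, AtomicInteger combinerCalls) {
        Stream<String> stream = parallel ? words.parallelStream() : words.stream();
        return stream.reduce(
                0L,
                (count, word) -> word.charAt(0) == 'A' ? count + 1 : count,
                (left, right) -> {
                    // Thread-safe counter, since parallel tasks may combine concurrently.
                    combinerCalls.incrementAndGet();
                    System.out.println("Merging: " + left + "-" + right);
                    return left + right;
                });
    }

    public static void main(String[] args) {
        List<String> results = Arrays.asList("A", "B", "A", "A", "C", "A", "A");

        AtomicInteger sequentialCalls = new AtomicInteger();
        long sequential = countA(results, false, sequentialCalls);
        System.out.println("sequential: " + sequential + ", combiner calls: " + sequentialCalls);

        AtomicInteger parallelCalls = new AtomicInteger();
        long parallel = countA(results, true, parallelCalls);
        System.out.println("parallel: " + parallel + ", combiner calls: " + parallelCalls);
    }
}
```

Both runs should arrive at the same count; only the parallel one is expected to print any "Merging" lines.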