While looking into the source code of WrappingSpliterator::trySplit, I was quite misled by its implementation:
@Override
public Spliterator<P_OUT> trySplit() {
    if (isParallel && buffer == null && !finished) {
        init();
        Spliterator<P_IN> split = spliterator.trySplit();
        return (split == null) ? null : wrap(split);
    }
    else
        return null;
}
And if you are wondering why this matters, it is because, for example, this:
Arrays.asList(1,2,3,4,5)
      .stream()
      .filter(x -> x != 1)
      .spliterator();
is using it. In my understanding, adding any intermediate operation to a stream will cause that code to be triggered.
Basically this method says that unless the stream is parallel, treat this Spliterator as one that cannot be split at all. And this matters to me. In one of my methods (this is how I got to that code), I get a Stream as input and "parse" it in smaller pieces, manually, with trySplit. You can think, for example, that I am trying to do a findLast from a Stream.
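To make the use case concrete, here is a minimal sketch of what such a findLast could look like. The class and method names are hypothetical (nothing like this exists in the JDK), and it assumes an ordered spliterator with non-null elements, where trySplit hands out a prefix and the original spliterator keeps the suffix:

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.Spliterator;
import java.util.concurrent.atomic.AtomicReference;

class FindLastSketch {

    // Hypothetical helper: last element of an ORDERED spliterator, if any.
    // Assumes elements are non-null (null would be reported as "absent").
    static <T> Optional<T> findLast(Spliterator<T> sp) {
        Spliterator<T> prefix = sp.trySplit();
        if (prefix == null) {
            // Cannot split (any further): traverse and remember the last element seen.
            AtomicReference<T> last = new AtomicReference<>();
            sp.forEachRemaining(last::set);
            return Optional.ofNullable(last.get());
        }
        // sp now covers the suffix, so search it first and fall back to the
        // prefix only when the suffix turns out to be empty (e.g. filtered away).
        Optional<T> fromSuffix = findLast(sp);
        return fromSuffix.isPresent() ? fromSuffix : findLast(prefix);
    }

    public static void main(String[] args) {
        Spliterator<Integer> sp = Arrays.asList(1, 2, 3, 4, 5).spliterator();
        System.out.println(findLast(sp));
    }
}
```

When the spliterator splits well, only a small suffix is actually traversed; when trySplit returns null right away, the whole thing degrades to a full forEachRemaining pass.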
And this is where my desire to split into smaller chunks is nuked, because as soon as I do:
Spliterator<T> sp = stream.spliterator();
Spliterator<T> prefixSplit = sp.trySplit();
I find out that prefixSplit is null, meaning that I basically can't do anything other than consume the entire sp with forEachRemaining.
And this is a bit weird. Maybe it makes some sense when filter is present, because in that case the only way (in my understanding) a Spliterator could be returned is by using some kind of buffer, maybe even with a predefined size (much like Files::lines). But why this:
Arrays.asList(1,2,3,4)
      .stream()
      .sorted()
      .spliterator()
      .trySplit();
returns null is something I don't understand. sorted is a stateful operation that buffers the elements anyway, without actually reducing or increasing their initial number, so at least theoretically this could return something other than null...
When you invoke spliterator() on a Stream, there are only two possible outcomes with the current implementation.
If the stream has no intermediate operations you’ll get the source spliterator that has been used to construct the stream and whose splitting capability is entirely independent from the stream’s parallel state, as in fact, the spliterator doesn’t know anything about the stream.
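A quick way to see this first case is the following sketch. The class and helper names are made up for illustration, and the result depends on the source: here it is an Arrays.asList list, whose array-based spliterator can split five elements.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

class SourceSpliteratorDemo {

    // Hypothetical helper: true if the first trySplit() call hands out a prefix.
    static boolean splitsOnce(Spliterator<?> sp) {
        return sp.trySplit() != null;
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3, 4, 5);

        // No intermediate operations: the stream returns its source spliterator,
        // so the sequential/parallel flag of the stream plays no role.
        System.out.println(splitsOnce(list.stream().spliterator()));
        System.out.println(splitsOnce(list.stream().parallel().spliterator()));
    }
}
```

Both calls are expected to report a successful split, since in both cases it is the list's own spliterator that answers trySplit.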
Otherwise, you’ll get a WrappingSpliterator, which will encapsulate a source Spliterator and a pipeline state, expressed as PipelineHelper. This combination of Spliterator and PipelineHelper does not need to work in parallel and, in fact, would not work in case of distinct(), as the WrappingSpliterator will get an entirely different combination, depending on whether the Stream is parallel or not.
For stateless intermediate operations, it would not make a difference though. But, as discussed in “Why the tryAdvance of stream.spliterator() may accumulate items into a buffer?”, the WrappingSpliterator is a “one-fits-all implementation” that doesn’t consider the actual nature of the pipeline, so its limitations are the superset of all possible limitations of all supported pipeline stages. So the existence of one scenario that wouldn’t work when ignoring the parallel flag is enough to forbid splitting for all pipelines when the stream is not parallel.
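As a rough illustration of the second case, the sketch below compares the same stateless pipeline in sequential and parallel mode. The class and helper names are hypothetical, and the behavior shown is that of the current implementation discussed here, not something the Stream API specification promises.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

class WrappingSplitDemo {

    // Hypothetical helper: true if the first trySplit() call hands out a prefix.
    static boolean splitsOnce(Spliterator<?> sp) {
        return sp.trySplit() != null;
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3, 4, 5);

        // Sequential pipeline with an intermediate operation: the wrapping
        // spliterator refuses to split because the stream is not parallel.
        System.out.println(splitsOnce(list.stream().filter(x -> x != 1).spliterator()));

        // Same pipeline marked parallel: the wrapper delegates to the source
        // spliterator's trySplit() and wraps the resulting prefix.
        System.out.println(splitsOnce(list.stream().parallel().filter(x -> x != 1).spliterator()));
    }
}
```

So if you control the stream and the pipeline is one you are comfortable treating as parallel, calling parallel() before spliterator() is one way to get a splittable result. Traversing the returned spliterator by hand still happens in whichever thread you call it from.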