
Obtaining a parallel Stream from a Collection

Is it correct that, in Java 8, you need the following code to be certain of obtaining a parallel stream from a Collection?

private <E> void process(final Collection<E> collection) {
    Stream<E> stream = collection.parallelStream().parallel();
    //processing
}

From the Collection API:

default Stream<E> parallelStream()

Returns a possibly parallel Stream with this collection as its source. It is allowable for this method to return a sequential stream.

From the BaseStream API:

S parallel()

Returns an equivalent stream that is parallel. May return itself, either because the stream was already parallel, or because the underlying stream state was modified to be parallel.

Isn't it awkward that I need to call two methods in a row that each supposedly parallelize the stream?

skiwi asked Oct 21 '22 12:10


1 Answer

Basically the default implementation of Collection.parallelStream() does create a parallel stream. The implementation looks like this:

default Stream<E> parallelStream() {
    return StreamSupport.stream(spliterator(), true);
}

But since this is a default method, an implementing class is free to override it and return a sequential stream instead. For example, suppose I create a MySequentialArrayList:

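import java.util.ArrayList;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class MySequentialArrayList extends ArrayList<String> {
    // Deliberately returns a sequential stream (second argument is false).
    @Override
    public Stream<String> parallelStream() {
        return StreamSupport.stream(spliterator(), false);
    }
}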

For an object of that class, the following code will print false as expected:

ArrayList<String> arrayList = new MySequentialArrayList();
System.out.println(arrayList.parallelStream().isParallel());

In this case, invoking the BaseStream#parallel() method ensures that the returned stream is parallel: either it was already parallel, or the call makes it parallel by setting the parallel field to true:

public final S parallel() {
    sourceStage.parallel = true;
    return (S) this;
}

This is the implementation of the AbstractPipeline#parallel() method.
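The effect of that flag can be observed from the outside with a small sketch. The class name ParallelFlagDemo and the helper flagBeforeAndAfter are made up for illustration; only stream(), parallel() and isParallel() come from the JDK. Whether parallel() hands back the very same object is an implementation detail (the API only says it "may return itself"), so the sketch prints that result without assuming it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class ParallelFlagDemo {

    // Hypothetical helper: returns { parallel before, parallel after, same instance }.
    public static boolean[] flagBeforeAndAfter(List<String> source) {
        Stream<String> stream = source.stream();   // stream() is specified to be sequential
        boolean before = stream.isParallel();
        Stream<String> result = stream.parallel(); // requests parallel execution
        boolean after = result.isParallel();
        return new boolean[] { before, after, result == stream };
    }

    public static void main(String[] args) {
        boolean[] r = flagBeforeAndAfter(Arrays.asList("a", "b", "c"));
        System.out.println("parallel before: " + r[0]); // false
        System.out.println("parallel after:  " + r[1]); // true
        // Implementation-dependent: the API only says parallel() "may return itself".
        System.out.println("same instance:   " + r[2]);
    }
}
```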

So the following code for the same object will print true:

System.out.println(arrayList.parallelStream().parallel().isParallel());
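Putting the snippets from this answer together gives a self-contained sketch that can be compiled and run as-is. The wrapper class ParallelStreamDemo and the two helper methods are hypothetical names added for illustration; MySequentialArrayList is the class from above, nested here only to keep everything in one file.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class ParallelStreamDemo {

    // Same class as above: parallelStream() is overridden to return a
    // sequential stream, which the Collection contract permits.
    public static class MySequentialArrayList extends ArrayList<String> {
        @Override
        public Stream<String> parallelStream() {
            return StreamSupport.stream(spliterator(), false);
        }
    }

    // Hypothetical helper: is the stream from parallelStream() alone parallel?
    public static boolean parallelStreamOnly(Collection<String> c) {
        return c.parallelStream().isParallel();
    }

    // Hypothetical helper: is it parallel once parallel() is chained on?
    public static boolean parallelStreamThenParallel(Collection<String> c) {
        return c.parallelStream().parallel().isParallel();
    }

    public static void main(String[] args) {
        Collection<String> plain = new ArrayList<>();
        Collection<String> sequential = new MySequentialArrayList();

        System.out.println(parallelStreamOnly(plain));              // true
        System.out.println(parallelStreamOnly(sequential));         // false
        System.out.println(parallelStreamThenParallel(sequential)); // true
    }
}
```

For a plain ArrayList the extra parallel() call changes nothing; it only matters for collections that override parallelStream() the way MySequentialArrayList does.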

If the stream is already parallel, then yes, parallel() is an extra method invocation, but it guarantees that you always get a parallel stream. I haven't yet dug deeply into the parallelization of streams, so I can't say which kinds of Collection, or which situations, would make parallelStream() return a sequential stream.

Rohit Jain answered Oct 23 '22 02:10