I want to take an input, apply a parallel stream to it, and get the output as a list. The input could be any List or any other collection that supports streams.
My concern is that if we want the output as a map, Java gives us an option like
list.parallelStream().collect(Collectors.toConcurrentMap(args))
But I cannot see any option to collect from a parallel stream into a list in a thread-safe way. One more option I see is to use
list.parallelStream().collect(Collectors.toCollection(<Concurrent Implementation>))
This way we can pass various concurrent implementations to the collect method. But I think CopyOnWriteArrayList is the only List implementation in java.util.concurrent. We could use the various queue implementations here, but those are not lists. What I mean is that we could only get a list through a workaround.
Could you please guide me on the best way to get the output as a list?
Note: I could not find any other post related to this, any reference would be helpful.
To create a parallel stream from a Collection use the parallelStream() method.
Java 8 introduced streams as an efficient way of carrying out bulk operations on data. Parallel streams can be obtained in environments that support concurrency, and they can improve performance at the cost of multi-threading overhead.
The Collection object used to receive the data being collected does not need to be concurrent. You can give it a simple ArrayList.
That is because the values from a parallel stream are not actually collected into a single Collection object. Each thread collects its own data, and then all sub-results are merged into a single final Collection object.
This is all well documented in the Collector javadoc, and the Collector is the parameter you're giving to the collect() method:
<R,A> R collect(Collector<? super T,A,R> collector)
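As a minimal sketch of the point above, the following collects a parallel stream into a plain, non-concurrent list in two ways: with Collectors.toList() and with Collectors.toCollection(ArrayList::new). The class and method names (ParallelToListDemo, squares, squaresIntoArrayList) are made up for illustration and are not part of the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ParallelToListDemo {

    // Hypothetical helper: squares each element in parallel and collects into a List.
    // Collectors.toList() is safe here even though the stream is parallel.
    static List<Integer> squares(List<Integer> input) {
        return input.parallelStream()
                .map(x -> x * x)
                .collect(Collectors.toList());
    }

    // Same idea, but explicitly asking for a plain (non-concurrent) ArrayList.
    static List<Integer> squaresIntoArrayList(List<Integer> input) {
        return input.parallelStream()
                .map(x -> x * x)
                .collect(Collectors.toCollection(ArrayList::new));
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5);
        // Encounter order is preserved for an ordered source such as a List.
        System.out.println(squares(input));              // [1, 4, 9, 16, 25]
        System.out.println(squaresIntoArrayList(input)); // [1, 4, 9, 16, 25]
    }
}
```

Neither variant needs a concurrent collection, because each partial result is accumulated in its own container before the containers are merged.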
"But there is no option that I can see to collect from parallel stream in thread safe way to provide list as output." This is entirely wrong.
The whole point of streams is that you can use a non-thread-safe Collection and still get perfectly valid, thread-safe results. This is because of how streams are implemented (and this was a key part of their design). A Collector defines a supplier method that creates a new container instance for each part of the stream that is processed separately. Those instances are then merged with each other.
So this is perfectly thread safe:
Stream.of(1,2,3,4).parallel()
.collect(Collectors.toList());
Since there are 4 elements in this stream, up to 4 instances of ArrayList may be created and then merged at the end into a single result (the exact number depends on how the stream is split, for example on the number of available CPU cores).
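The supplier-and-merge behaviour can be observed with a small hand-written collector. This is only a sketch: the class SupplierCountDemo, the counter, and the helper methods are hypothetical names introduced for illustration, and the number of containers actually created depends on the JVM, the common pool size, and how the stream is split, so no particular count is assumed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SupplierCountDemo {

    // Counts how many intermediate containers the supplier was asked to create.
    static final AtomicInteger containersCreated = new AtomicInteger();

    // A list collector built with Collector.of: supplier, accumulator, combiner.
    static Collector<Integer, List<Integer>, List<Integer>> countingListCollector() {
        return Collector.of(
                () -> {                       // supplier: one new ArrayList per partial result
                    containersCreated.incrementAndGet();
                    return new ArrayList<Integer>();
                },
                List::add,                    // accumulator: only touches its own container
                (left, right) -> {            // combiner: merges two partial results
                    left.addAll(right);
                    return left;
                });
    }

    // Collects 1..n from a parallel stream using the counting collector.
    static List<Integer> collectParallel(int n) {
        containersCreated.set(0);
        return IntStream.rangeClosed(1, n)
                .boxed()
                .parallel()
                .collect(countingListCollector());
    }

    public static void main(String[] args) {
        List<Integer> result = collectParallel(1000);
        List<Integer> expected = IntStream.rangeClosed(1, 1000).boxed().collect(Collectors.toList());
        System.out.println("Result is correct: " + result.equals(expected));
        System.out.println("Containers created: " + containersCreated.get());
    }
}
```

Each ArrayList is filled independently of the others and the lists are only combined afterwards, which is why no synchronization on the list itself is needed.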
On the other hand, methods like toConcurrentMap create a single result container, and all threads put their results into it.
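For contrast, here is a small sketch of the concurrent style, where the result container is a ConcurrentMap shared by all threads. The class name ConcurrentMapDemo and the method byLength are hypothetical and only serve the illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

public class ConcurrentMapDemo {

    // Maps each word's length to the word. The keys must be unique here,
    // because this overload of toConcurrentMap throws IllegalStateException
    // on duplicate keys.
    static ConcurrentMap<Integer, String> byLength(List<String> words) {
        return words.parallelStream()
                .collect(Collectors.toConcurrentMap(String::length, w -> w));
    }

    public static void main(String[] args) {
        ConcurrentMap<Integer, String> map = byLength(Arrays.asList("a", "bb", "ccc"));
        System.out.println(map.get(2)); // bb
        System.out.println(map.size()); // 3
    }
}
```

This works because the map itself handles concurrent insertion, whereas the list collectors avoid the problem by giving each partial result its own container.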