Consider this (completely contrived) Java code:
final List<Integer> s = Arrays.asList(1, 2, 3);
final int[] a = new int[1];
a[0] = 100;
s.parallelStream().forEach(i -> {
    synchronized (a) {
        a[0] += i;
    }
});
System.out.println(a[0]);
Is this code guaranteed to output "106"?
It seems that it is not, unless there is a happens-before relationship established by parallelStream() by which we can know for sure that the first access to a[0] in the lambda will see 100 and not zero (according to my understanding of the Java memory model). But Collection.parallelStream() is not documented to establish such a relationship... The same question applies to completion of the pipeline: when the terminal forEach() returns, is there a happens-before edge guaranteeing that the subsequent read of a[0] sees all of the updates made in the lambda?
So am I missing something, or is it true that, for correctness, the above code would have to look something like this instead:
final List<Integer> s = Arrays.asList(1, 2, 3);
final int[] a = new int[1];
synchronized (a) {
    a[0] = 100;
}
s.parallelStream().forEach(i -> {
    synchronized (a) {
        a[0] += i;
    }
});
synchronized (a) {
    System.out.println(a[0]);
}
Or... does parallelStream() actually provide these happens-before relationships, and is this simply a matter of missing documentation? I'm asking because, from an API design perspective, it seems (to me at least) like this would be the logical thing to do, analogous to Thread.start(), etc.
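For reference, here is a minimal sketch of the guarantee the question is alluding to for plain threads (the class name is just for illustration): JLS 17.4.5 specifies that a call to Thread.start() happens-before the first action of the started thread, and that all actions of that thread happen-before join() on it returns.

public class StartJoinDemo {
    public static void main(String[] args) throws InterruptedException {
        final int[] a = new int[1];
        a[0] = 100;                   // written by the main thread before start()

        Thread t = new Thread(() -> a[0] += 1);
        t.start();                    // start() happens-before the thread's first action, so it sees 100
        t.join();                     // the thread's actions happen-before join() returns

        System.out.println(a[0]);     // guaranteed to print 101, with no synchronization on a
    }
}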
The parallelStream() method of the Collection interface can be used to create a parallel stream with a collection as the data source. No other code is necessary for parallel execution, as the data partitioning and thread management for a parallel stream are handled by the API and the JVM.
To create a parallel stream from another stream, use the parallel() method. To create a parallel stream from a Collection, use the parallelStream() method.
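A small sketch of both ways of obtaining a parallel stream (the class and variable names are just for illustration):

import java.util.Arrays;
import java.util.List;

public class ParallelCreationDemo {
    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3);

        // From a Collection: parallelStream() returns a parallel stream directly.
        int sumFromCollection = numbers.parallelStream()
                .mapToInt(Integer::intValue)
                .sum();

        // From an existing stream: parallel() marks the pipeline for parallel execution.
        int sumFromStream = numbers.stream()
                .parallel()
                .mapToInt(Integer::intValue)
                .sum();

        System.out.println(sumFromCollection + " " + sumFromStream); // both print 6
    }
}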
In the case of a parallel stream, the work is split across multiple worker threads. Internally the stream framework uses the fork/join machinery and runs its tasks on the shared ForkJoinPool.commonPool(), whose default parallelism is typically one less than the number of available processors.
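That pool can be observed directly; a sketch (the parallelism value and thread names will vary by machine, and the class name is just for illustration):

import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

public class CommonPoolDemo {
    public static void main(String[] args) {
        // Typically Runtime.getRuntime().availableProcessors() - 1
        System.out.println("Common pool parallelism: " + ForkJoinPool.commonPool().getParallelism());

        // Parallel stream tasks usually run on commonPool worker threads,
        // plus the calling thread itself.
        IntStream.rangeClosed(1, 5)
                .parallel()
                .forEach(i -> System.out.println(i + " on " + Thread.currentThread().getName()));
    }
}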
Streams returned by parallelStream() are always parallel... And then there is Spliterator, in particular its characteristics() method; among the characteristics a spliterator can report are CONCURRENT and even IMMUTABLE.
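Those characteristics can be inspected directly; a small sketch (class name is just for illustration):

import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

public class SpliteratorCharacteristicsDemo {
    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3);
        Spliterator<Integer> spliterator = list.spliterator();

        // The spliterator of a typical List reports SIZED and ORDERED,
        // but not CONCURRENT or IMMUTABLE.
        System.out.println("CONCURRENT: " + spliterator.hasCharacteristics(Spliterator.CONCURRENT));
        System.out.println("IMMUTABLE:  " + spliterator.hasCharacteristics(Spliterator.IMMUTABLE));
        System.out.println("SIZED:      " + spliterator.hasCharacteristics(Spliterator.SIZED));
        System.out.println("ORDERED:    " + spliterator.hasCharacteristics(Spliterator.ORDERED));
    }
}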
You really should avoid touching variables 'outside' the pipeline. Even if you get it to work correctly, performance will likely suffer. The JDK already ships tools for this kind of accumulation; for example, your use case is probably safer with something like:
int sum = 100 + IntStream.of(1, 2, 3)
        .parallel()
        .reduce(0, (accumulator, element) -> accumulator + element);
// Note: reduce()'s identity must be a true identity for the accumulator (0 for addition);
// passing 100 as the identity could add it once per partition when the stream runs in parallel.
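If you really do need to accumulate into a mutable container, the documented thread-safe route is the three-argument collect(), which gives each worker its own container and merges the partial results. A sketch (class name is just for illustration):

import java.util.stream.IntStream;

public class CollectDemo {
    public static void main(String[] args) {
        // Each worker gets its own int[1] accumulator; the stream framework merges them,
        // so no synchronization on shared state is needed.
        int[] total = IntStream.of(1, 2, 3)
                .parallel()
                .collect(() -> new int[1],                       // supplier: per-thread container
                         (acc, element) -> acc[0] += element,    // accumulator
                         (left, right) -> left[0] += right[0]);  // combiner

        System.out.println(100 + total[0]); // prints 106
    }
}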