Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In Java 8, does a sequential and ordered Stream guarantee performing operations in encounter order?

Is there any guarantee that operations on a sequential and ordered Stream are processed in encounter order?

I mean, if I have code like this:

IntStream.range(0, 5)
        .map(i -> {
            myFunction(i);
            return i * 2;
         })
         .boxed()
         .collect(toList());

is there a guarantee, that it will perform myFunction() calls in the encounter order of the generated range?

I found draft JavaDocs for the Stream class which explicitely states this:

For sequential stream pipelines, all operations are performed in the encounter order of the pipeline source, if the pipeline source has a defined encounter order.

but in the official JavaDocs this line was removed. It now discusses encounter order only for selected methods. The Package java.util.stream doc in the Side-effects paragraph states:

Even when a pipeline is constrained to produce a result that is consistent with the encounter order of the stream source (for example, IntStream.range(0,5).parallel().map(x -> x*2).toArray() must produce [0, 2, 4, 6, 8]), no guarantees are made as to the order in which the mapper function is applied to individual elements, or in what thread any behavioral parameter is executed for a given element.

but it says nothing about sequential streams and the example is for a parallel stream (My understanding is that it's true for both sequential and parallel streams, but this is the part I'm not sure about).

On the other hand, it also states in the Ordering section:

If a stream is ordered, most operations are constrained to operate on the elements in their encounter order; if the source of a stream is a List containing [1, 2, 3], then the result of executing map(x -> x*2) must be [2, 4, 6]. However, if the source has no defined encounter order, then any permutation of the values [2, 4, 6] would be a valid result.

but this time it start with "operating on the elements", but the example is about the resulting stream, so I'm not sure they are taking side-effects in account and side-effects is really what this question is about.

like image 616
piotrek Avatar asked May 17 '16 13:05

piotrek


1 Answers

I think we can learn a lot from the fact that this explicit sentence has been removed. This question seems to be closely related to the question “Does Stream.forEach respect the encounter order of sequential streams?”. The answer from Brian Goetz basically says that despite the fact that there’s no scenario where the order is ignored by the Stream’s current implementation when forEach is invoked on a sequential Stream, forEach has the freedom to ignore the encounter order even for sequential Streams per specification.

Now consider the following section of Stream’s class documentation:

To perform a computation, stream operations are composed into a stream pipeline. A stream pipeline consists of a source (which might be an array, a collection, a generator function, an I/O channel, etc), zero or more intermediate operations (which transform a stream into another stream, such as filter(Predicate)), and a terminal operation (which produces a result or side-effect, such as count() or forEach(Consumer)). Streams are lazy; computation on the source data is only performed when the terminal operation is initiated, and source elements are consumed only as needed.

Since it is the terminal operation which determines whether elements are needed and whether they are needed in the encounter order, a terminal action’s freedom to ignore the encounter order also implies consuming, hence processing, the elements in an arbitrary order.

Note that not only forEach can do that. A Collector has the ability to report an UNORDERED characteristic, e.g. Collectors.toSet() does not depend on the encounter order. It’s obvious that also an operation like count() doesn’t depend on the order—in Java 9 it may even return without any element processing. Think of IntStream#sum() for another example.

In the past, the implementation was too eager in propagating an unordered characteristic up the stream, see “Is this a bug in Files.lines(), or am I misunderstanding something about parallel streams?” where the terminal operation affected the outcome of a skip step, which is the reason why the current implementation is reluctant about such optimizations to avoid similar bugs, but that doesn’t preclude the reappearance of such optimizations, then being implemented with more care…

So at the moment it’s hard to imagine how an implementation could ever gain a performance benefit from exploiting the freedom of unordered evaluations in a sequential Stream, but, as stated in the forEach-related question, that doesn’t imply any guarantees.

like image 127
Holger Avatar answered Nov 13 '22 15:11

Holger