Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Which operations preserve order [duplicate]

TL;DR; I am looking for a place where I can lookup whether a certain intermediate or terminal operation. Where can I find such documentation?

Edit This is not a duplicate of How to ensure order of processing in java8 streams?, since that question does not provide a comprehensive list of operations.

Background

The the package documentation says:

Whether or not a stream has an encounter order depends on the source and the intermediate operations

Which is repeated in this excellent stackoverflow answer

In order to ensure maintenance of ordering throughout an entire stream operation, you have to study the documentation of the stream’s source, all intermediate operations and the terminal operation for whether they maintain the order or not (or whether the source has an ordering in the first place).

That is all well and good, but which documentation should I look at? The the package documentation mentions in an example that map guarantees ordering, but it doesn't have a exhaustive list. The javadoc for the Stream class documents some intermediate operations, but not all. Take for example map:

Returns a stream consisting of the results of applying the given function to the elements of this stream.

This is an intermediate operation.

or filter

Returns a stream consisting of the elements of this stream that match the given predicate.

This is an intermediate operation.

None of these describe whether they preserve ordering.

This stackoverflow answer claims:

Actually every intermediate operation preserves an order by default. The only exceptions are:

  • unordered() which removes the ordering constraint.
  • sorted() which changes the order.

When it's not explicitly specified, you can assume that operation keeps the order. Even distinct() keeps the order, though it adds much complexity for parallel stream.

but is there any official documentation to back that up?

Extra credit 😉

The is actually has two separate ordering issues.

  1. Does the output of the operation maintain the same ordering as the input?
  2. Is the operation executed in-order on each element.

For example, a parallel map operation could go over all elements in an arbitrary order (violating 2.), but still maintain order in the returned stream (obeying 1.)

like image 249
Tobber Avatar asked Aug 31 '17 09:08

Tobber


People also ask

Does parallel stream preserve order?

If our Stream is ordered, it doesn't matter whether our data is being processed sequentially or in parallel; the implementation will maintain the encounter order of the Stream.

How to preserve the order of list in Python?

fromkeys(list)) that goes through two phases: (1) Convert the list to a dict using the dict. fromkeys() function with the list elements as keys and None as dict values. (2) Convert the dictionary back to a list using the list() constructor. As dictionaries preserve the order of the keys, the list ordering is preserved.


2 Answers

After some researching in the source code I summarized the following tables:

Taken from: Java streams - Part 6 - Spliterator

The following table showes which operations types are allowed to modify charactersiticts:

|                        | DISTICTS | SORTED | ORDERED | SIZED | SHORT_CIRCUIT |
| ---------------------- | -------- | ------ | ------- | ----- | --------------|
| source stream          | Y        | Y      | Y       | Y     | N             |
| intermediate operation | PCI      | PCI    | PCI     | PC    | PI            |
| terminal operation     | N        | N      | PC      | N     | PI            |
  • Y - Allowed to have
  • P - May preserves
  • C - May clears.
  • I - May injects.
  • N - Not valid; Irelevant for the operation.

Taken from Java streams - Stream methods characteristics table

The following table showes which characteristics and flags each intermediate operation/terminal operation may turns on and off: (SHORT_CIRCUIT is relevent only in the context of StreamOpFlag flags)

Note: P (Preserve) flag is added to every cell except the ones with the C and I (Clear and Inject) flags.

|                  |  DISTINCT |  SORTED |  ORDERED |  SIZED |  SHORT_CIRCUIT |
| ---------------- | ----------| --------| ---------| -------| ---------------|
|  filter          |           |         |          |  C     |                |
|  forEach         |           |         |  C       |        |                |
|  forEachOrdered  |           |         |          |        |                |
|  allMatch        |           |         |  C       |        |  I             |
|  distinct        |  I        |         |          |  C     |                |
|  flatMap         |  C        |  C      |          |  C     |                |
|  anyMatch        |           |         |  C       |        |  I             |
|  collect         |           |         |          |        |                |
|  unOrdered       |           |         |  C       |        |                |
|  count           |  C        |  C      |  C       |  C     |                |
|  findAny         |           |         |  C       |        |  I             |
|  findFirst       |           |         |          |        |  I             |
|  flatMapToXXX    |  C        |  C      |          |  C     |                |
|  limit           |           |         |          |  C     |  I             |
|  map             |  C        |  C      |          |        |                |
|  mapToXXX        |  C        |  C      |          |        |                |
|  max             |           |         |          |        |                |
|  min             |           |         |          |        |                |
|  noneMatch       |           |         |  C       |        |  I             |
|  peek            |           |         |          |        |                |
|  reduce          |           |         |          |        |                |
|  skip            |           |         |  C       |  I     |                |
|  sorted          |           |  I      |  I       |        |                |
|  toArray         |           |         |          |        |                |
  • C - Clears.
  • I - Injects.
like image 69
Stav Alfi Avatar answered Sep 21 '22 01:09

Stav Alfi


It's sort of sounds like two duplicates - since both the answers that you linked actually explain things. I can't tell if map or filter should specifically say that they preserver order; they don't rely on any previous state or any other state (these are stateless operations) so it is implied that they preserve order as far as I can see. I see it the other way around, if they don't preserver order - that should be explicitly mentioned in the docs; if it not obvious from the name of the operation. For example Stream.generate is not obvious to me if it generated an ordered stream; thus this is said in the documentation of it:

Returns an infinite sequential unordered stream where each element is generated by the provided Supplier.

sorted and unordered on the other hand are pretty obvious (IMO) to change order, at least when you put these in - you are explicitly saying that you do not care about the order. unordered btw will not do any randomizations on purpose to satisfy this, you can read more here.

There are two orders in general: processing order and encounter order.

You can think about encounter order as processing from left to right (imagine you have a List or an array). So if you have a pipeline that does not change order - elements will be fed to the Collector (or any other terminal operation) as seen from left to right. Well not all terminal operations are like this. One obvious difference is forEach and forEachOrdered; or a Collectors.toSet - which does not need to preserver the initial order at all. Or let's take findAny as the terminal operation - obviously you don't care which element you want, so why bother feeding the findAny in an exact order in the first place?

Processing order on the other hand has no defined order - especially visible for parallel processing obviously. So even if your pipeline is parallel (and elements are processed with absolutely no guarantee of any order), they will still be fed to the terminal operation in order - if such an order is needed for that terminal operation.

like image 27
Eugene Avatar answered Sep 21 '22 01:09

Eugene