Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java 8 findFirst and encounter order

Tags:

The JavaDocs for findFirst say that if the stream has an encounter order, then the first element will always be returned, but if the stream does not have an encounter order, any element may be returned.

I'm trying to demonstrate how this works on a stream without an encounter order, but I can't get it to return anything other than the actual first element.

I tried adding the elements to a Set, which does not have a defined encounter order:

    Set<String> words = new HashSet<>();     words.addAll(Arrays.asList("this", "is", "a", "stream", "of", "strings"));     Optional<String> firstString = words.stream()             .findFirst();     System.out.println(firstString); 

Every time I run, I get a as the first string. Then I tried doing a Collections.shuffle on the List before adding it to the Set, but that didn't change anything.

    List<String> wordList = Arrays.asList("this", "is", "a", "stream", "of", "strings");     words = new HashSet<>();     words.addAll(wordList);     firstString = words.stream()             .findFirst();     System.out.println(firstString); 

I still get back the word a every time.

Then I tried using the unordered method from BaseStream, which claims to return a stream without encounter order, but no difference:

    firstString = Stream.of("this", "is", "a", "stream", "of", "strings")             .unordered()             .findFirst();     System.out.println(firstString); 

Now I get the word this every time. Am I missing something? Is there some way to demonstrate that findFirst on an unordered stream returns different values?

like image 944
kousen Avatar asked Jan 27 '17 12:01

kousen


People also ask

Which is faster findAny or findFirst?

When Stream is unordered, findFirst() and findAny() are the same. But when Stream is ordered, findAny() will be better.

What is the difference between findFirst () and findAny ()?

In Java 8 Stream, the findFirst() returns the first element from a Stream, while findAny() returns any element from a Stream.

What does findFirst () do in Java?

The findFirst() method finds the first element in a Stream. So, we use this method when we specifically want the first element from a sequence. When there is no encounter order, it returns any element from the Stream. According to the java.

What does findFirst () do in Java stream?

Stream findFirst() in Java with examples Stream findFirst() returns an Optional (a container object which may or may not contain a non-null value) describing the first element of this stream, or an empty Optional if the stream is empty. If the stream has no encounter order, then any element may be returned.


2 Answers

Well, “any” includes the possibility of “first”. Of course, the Stream implementation does not waste efforts in randomizing the data, so for a lot of cases, especially with sequential execution, it will still be the first element, if we can call it that way (as without an order, there is no distinguished first element).

Your best chances of exhibiting different results for findFirst are with parallel Streams. But even there, not every combination of operations is suitable of exhibiting the unorderedness.

One point is, that in the current implementation, the findFirst() operation does not change it’s behavior when the Stream is unordered, i.e. it does not actively try to be like findAny(). It may still exhibit unpredictable behavior due to the source of the Stream, but if your source is Stream.of("this", "is", "a", "stream", "of", "strings"), i.e. an immutable sequence of a known size, it has already the best parallel performance possible, so there’s simply no way of drawing a benefit of the chained unordered(), hence, the current implementation doesn’t change its behavior.

It might surprise, but this even applies to a HashSet to some degree. While it has an unspecified order, there will be an actual order within its backing array at some point of time and as long as you don’t modify the Set, there will be no reason to shuffle these entries around, so for a particular HashSet instance, you may repeatedly get the same “first” element, though it’s not specified which one and even within a single runtime, another HashSet instance representing the same contents, but having a different history, may have a different order.


One example of an operation that is known to draw a benefit from the unordered characteristics is distinct. While it has to sort out duplicates, it has to keep the first encountered of equal elements, if it makes a notable difference. This can degrade the performance significantly, hence, the implementation will immediately try to get a benefit if the stream is unordered. E.g.

List<String> equal=IntStream.range(0, 100)     .mapToObj(i->new String("test")) // don't do this in normal code     .collect(Collectors.toList()); Map<String, Integer> map = IntStream.range(0, equal.size())     .collect(IdentityHashMap::new, (m,i)->m.put(equal.get(i),i), Map::putAll);  equal.parallelStream().distinct().map(map::get)      .findFirst().ifPresent(System.out::println); 

This creates a bunch of equal but distinguishable String instances (which you normally shouldn’t do), registers them with their positional number in an IdentityHashMap, so we can find out, which instance distinct has retained. Since the code above uses an ordered stream created by a List, it consistently prints 0, regardless of how often you execute it.

In contrast,

equal.parallelStream().unordered().distinct().map(map::get)      .findFirst().ifPresent(System.out::println); 

will print arbitrary numbers of the range, as we have released the ordered contract and allow to pick any of the equal strings.


As already noted before, this is all implementation specific. You should never make an assumption about whether an operation can actually draw a benefit and thus, will change its behavior for unordered streams. The explanation above was only meant to illustrate why sometimes the behavior of a particular implementation might not change for unordered stream. Though, it still might in the next version or a different JRE implementation.

like image 135
Holger Avatar answered Oct 05 '22 08:10

Holger


Holger has already ably explained the situation. (+1) I'd like to provide a demonstration of HashSet instances that have the same contents but that have different iteration order. First we create a set as before:

    List<String> wordList = Arrays.asList("this", "is", "a", "stream", "of", "strings");     Set<String> words = new HashSet<>(wordList); 

We create another set of words, add a bunch of stuff (doesn't matter exactly what it is), and then remove it:

    Set<String> words2 = new HashSet<>(wordList);     IntStream.range(0, 50).forEachOrdered(i -> words2.add(String.valueOf(i)));     words2.retainAll(wordList); 

If we inspect the results as follows:

    System.out.println(words.equals(words2));     System.out.println(words);     System.out.println(words2); 

we can see from the output that the sets are equal but iterate in a different order:

true [a, strings, stream, of, this, is] [this, is, strings, stream, of, a] 

As noted elsewhere, if you get a stream from these and call findFirst(), the result is the first element in iteration order, which will clearly differ between these sets.

What happened is that by adding and removing a bunch of elements, we've caused the set to increase its internal table size, requiring the elements to be rehashed. The original elements end up in different relative positions in the new table, even after the new elements have been removed.

Although HashSets have no specified iteration order, the order is likely to be repeatable (and even predictable) if the set is initialized with the same contents in the same way every time. We thus say that the stream from a set has no defined encounter order, even though the order is often the same each time.

Note that in JDK 9, the new immutable sets (and maps) are actually randomized, so their iteration orders will change from run to run, even if they are initialized the same way every time.

like image 33
Stuart Marks Avatar answered Oct 05 '22 07:10

Stuart Marks