This is the implementation of the java.util.stream.Collectors class's toSet() method:
public static <T>
Collector<T, ?, Set<T>> toSet() {
    return new CollectorImpl<>((Supplier<Set<T>>) HashSet::new, Set::add,
                               (left, right) -> { left.addAll(right); return left; },
                               CH_UNORDERED_ID);
}
As we can see, it uses a HashSet and calls add. From the HashSet documentation: "It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time."
In the following code, a List of String is streamed, sorted and collected into a Set:
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) {
        Set<String> strings = Arrays.asList("c", "a", "b")
                .stream()
                .sorted()
                .collect(Collectors.toSet());
        System.out.println(strings.getClass());
        System.out.println(strings);
    }
}
This provides the output:
class java.util.HashSet
[a, b, c]
The output is sorted. What I think is happening here is that although the contract provided by the HashSet documentation specifies that ordering is not something it provides, the implementation happens to add in order. I suppose this could change in future versions or vary between JVMs, and that a wiser approach would be to do something like Collectors.toCollection(TreeSet::new).
Can sorted() be relied upon when calling Collectors.toSet()?
Additionally, what exactly does "it does not guarantee that the order will remain constant over time" mean? (I suppose add, remove, the resizing of the underlying array?)
To answer that question, you have to know a bit about how HashSet is implemented. As the name suggests, a HashSet is implemented using a hash table. Basically, a hash table is an array that is indexed by element hashes. A hash function (in Java, an object's hash is calculated by object.hashCode()) is basically a function that meets a few criteria; the one that matters here is that objects that .equals() each other have identical hashes.

So, when you meet a HashSet that is "sorted" (which is understood as "the iterator preserves the natural order of elements"), this is due to a couple of coincidences, chiefly that the natural order of the elements happens to match the order of their hashCodes.

If you look into the String class hashCode() method, you will see that for one-letter strings, the hash code corresponds to the Unicode index (codepoint) of the letter - so in this specific case, as long as these few hash codes fall into ascending buckets of the hash table, the elements will come out sorted. However, this is a huge coincidence, and moreover, it has nothing to do with the fact that sorted() was called on the stream - it's simply due to the way hashCode() is implemented and therefore the ordering of the hash table. Therefore, the simple answer to the question is "no".
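To make the coincidence visible, here is a small sketch (the class and method names are mine, purely for illustration). It repeats the pipeline from the question and then adds one more letter. The hashCode() values are fixed by the String contract, but the iteration order depends on the HashMap implementation: on the OpenJDK builds I'm aware of, the default table has 16 buckets, so "p" (112) should land in bucket 0, ahead of "a" (97, bucket 1), and the "sorted" look disappears. Treat that order as an observation about one implementation, not a guarantee.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class HashOrderDemo {

    // Same pipeline as in the question: sorted(), then Collectors.toSet().
    // The returned list is a copy that records the set's iteration order.
    static List<String> iterationOrder(String... values) {
        Set<String> set = Arrays.stream(values)
                .sorted()
                .collect(Collectors.toSet());
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        // For a one-letter string, hashCode() is the letter's code point.
        System.out.println("a".hashCode()); // 97
        System.out.println("p".hashCode()); // 112

        // Adjacent letters: the order only looks sorted because of bucket placement.
        System.out.println(iterationOrder("c", "a", "b"));

        // One more letter: on a 16-bucket table, 112 % 16 == 0 and 97 % 16 == 1,
        // so "p" is likely to be iterated before "a" despite sorted().
        System.out.println(iterationOrder("c", "a", "b", "p"));
    }
}
```

Whatever the second line prints on your JVM, the point is that sorted() has no influence on it: the collector simply calls add on a HashSet.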
The answer is no. Once you have added the items to a Set, you cannot rely on any order. From the JDK source code (HashSet.java):
/**
 * Returns an iterator over the elements in this set. The elements
 * are returned in no particular order.
 *
 * @return an Iterator over the elements in this set
 * @see ConcurrentModificationException
 */
public Iterator<E> iterator() {
    return map.keySet().iterator();
}
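"No particular order" also covers the "constant over time" part of the question. The following is a rough sketch (class and method names are hypothetical) of how a resize can reorder elements that are already in the set. It assumes OpenJDK's HashMap behaviour: 16 initial buckets, a resize once more than 12 elements are present, and Integer hash codes equal to the value. Under those assumptions 17 and 1 start in the same bucket and are split apart when the table grows, so their relative order may flip. Other implementations are free to behave differently.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ResizeOrderDemo {

    // Returns a and b in the order in which the set's iterator yields them.
    static List<Integer> orderOf(Set<Integer> set, int a, int b) {
        List<Integer> seen = new ArrayList<>();
        for (int value : set) {
            if (value == a || value == b) {
                seen.add(value);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Set<Integer> set = new HashSet<>(); // default capacity
        set.add(17);
        set.add(1);
        // Assumption: with 16 buckets, 17 and 1 share a bucket,
        // so they are iterated in insertion order here.
        System.out.println("before growth: " + orderOf(set, 1, 17));

        // Add enough unrelated elements to push the table past its resize threshold.
        for (int i = 100; i < 112; i++) {
            set.add(i);
        }
        // Assumption: with 32 buckets, 1 and 17 sit in different buckets,
        // so their relative order can differ from the line above.
        System.out.println("after growth:  " + orderOf(set, 1, 17));
    }
}
```

Neither 1 nor 17 was touched between the two print statements; only the table underneath them changed.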
Now, in previous versions of the JDK, even though an order wasn't guaranteed, you'd usually get the items in the same order of insertion (unless the class of the objects implements hashCode(), and then you'll get the order that is dictated by hashCode()): either the order of creation of the objects or the order of invocation of hashCode() on the objects. As @Holgar mentions in the comments below, in HotSpot it's the latter. And you can't even count on that, since there are exceptions to this as well: the sequential number is not the only ingredient in the hashCode generator.
I recently heard a talk from Stuart Marks (the guy who's responsible for a re-write of a major part of Collections in Java 9) and he said that they've added randomization to the iteration order of Sets (created by the new set factories) in Java 9. The session includes a part where he talks about sets - good talk, highly recommended by the way!
So even if you used to count on iteration order of Sets, once you move to Java 9 you should stop doing so.
All that said, if you need order you should consider using a SortedSet, LinkedHashSet or TreeSet.
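As a minimal sketch of that advice (the class and method names are made up for the example), Collectors.toCollection lets you pick the Set implementation yourself. A TreeSet sorts by natural order on its own, while a LinkedHashSet keeps whatever encounter order the stream delivers, so a preceding sorted() is preserved.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

class OrderedSetDemo {

    // TreeSet keeps its elements in natural order, so no sorted() step is needed.
    static Set<String> toSortedSet(List<String> input) {
        return input.stream()
                .collect(Collectors.toCollection(TreeSet::new));
    }

    // LinkedHashSet iterates in insertion order, so the sorted() step survives.
    static Set<String> toInsertionOrderedSet(List<String> input) {
        return input.stream()
                .sorted()
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("p", "c", "a", "b");
        System.out.println(toSortedSet(input));           // [a, b, c, p]
        System.out.println(toInsertionOrderedSet(input)); // [a, b, c, p]
    }
}
```

The difference shows up later: elements added to the TreeSet afterwards are placed in sorted position, whereas the LinkedHashSet appends them at the end.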