The Set interface makes no promises on whether implementations permit null elements. Each implementation is supposed to declare this in its documentation.
Collectors.toSet() promises to return an implementation of Set but explicitly makes “no guarantees on the type, mutability, serializability, or thread-safety of the Set returned”. Null-safety is not mentioned.
The current implementation of Collectors.toSet() in OpenJDK always uses HashSet, which permits null elements, but this could change in the future and other implementations may do differently.
If a Set implementation prohibits null elements, it throws NullPointerException at various times, in particular during an attempt to add(null). It would seem that if Collectors.toSet() decided to use a null-intolerant Set implementation, calling stream.collect(Collectors.toSet()) on a Stream stream that contains nulls would throw. The specification of collect does not list any exceptions, nor does the specification of any of the Collectors methods. This could suggest that the collect call permits nulls within stream, but on the other hand it’s not clear whether this actually means much at all, as NullPointerException is an unchecked exception and doesn’t strictly have to be listed.
Is this specified more clearly anywhere else? In particular, is the following code guaranteed not to throw? Is it guaranteed to return true?
import java.util.stream.*;

class Test {
    public static boolean setContainsNull() {
        return Stream.of("A", "list", "of", null, "strings")
                .collect(Collectors.toSet())
                .contains(null);
    }
}
If not, then I assume we should always ensure a stream contains no nulls before using Collectors.toSet() or be ready to handle NullPointerException. (Is this exception alone enough though?) Alternatively, when this is unacceptable or hard, we can request a specific set implementation using code like Collectors.toCollection(HashSet::new).
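As a minimal sketch of that last alternative: the class and method names below (NullTolerantCollect, toHashSet) are made up for illustration and are not part of any library. The point is only that naming HashSet explicitly ties the result to a type whose documentation permits the null element, rather than relying on whatever Collectors.toSet() happens to return.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class NullTolerantCollect {
    // Hypothetical helper: collects into an explicitly chosen HashSet,
    // which is documented to permit the null element.
    static <T> Set<T> toHashSet(List<T> items) {
        return items.stream().collect(Collectors.toCollection(HashSet::new));
    }

    public static void main(String[] args) {
        Set<String> set = toHashSet(Arrays.asList("A", "list", "of", null, "strings"));
        // The null element is kept because HashSet allows it.
        System.out.println(set.contains(null));
    }
}
```

The trade-off is that the caller now depends on a concrete, mutable set type, which is exactly the commitment Collectors.toSet() avoids making.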
Edit: there is an existing question that sounds superficially similar, and this question got closed as a supposed duplicate of that. However, the linked question does not address Collectors.toSet() at all. Moreover, the answers to that question form the underlying assumptions of my question. That question asks: are nulls allowed in streams? Yes. But what happens when a (perfectly allowed) stream that contains nulls gets collected via a standard collector?
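A Java Map implementation such as HashMap can hold a null key and null values. Collectors.toMap(), however, throws NullPointerException when the value mapper returns null, so make sure your value mapper never produces null; whether a null key is accepted depends on the map implementation being filled.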
We can pass the lambda expression str -> str != null (or the method reference Objects::nonNull) to the stream's filter() to drop null values before collecting.
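A short sketch of that filtering step, for anyone who would rather not depend on the null behavior of Collectors.toSet() at all. The class and method names (NullFilteringCollect, toSetWithoutNulls) are hypothetical and chosen only for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

class NullFilteringCollect {
    // Hypothetical helper: removes null elements before collecting,
    // so the result never depends on whether the returned Set tolerates null.
    static Set<String> toSetWithoutNulls(List<String> items) {
        return items.stream()
                .filter(Objects::nonNull) // same effect as str -> str != null
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        Set<String> set = toSetWithoutNulls(Arrays.asList("A", "list", "of", null, "strings"));
        System.out.println(set);
    }
}
```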
There is a difference between deliberately unspecified behaviors, like “type, mutability, serializability, or thread-safety” and underspecified behavior, like the null support.
Whenever a behavior is underspecified, the actual behavior of the reference implementation tends to become the matter of fact that can’t be changed later, even if counteracting the original intention, due to compatibility constraints, or at least it can’t be changed without a strong reason.
Note that while the reserved right to return a truly immutable or non-serializable Set was not used, simply because no such type existed upon the Java 8 release, enforcing a non-null behavior was possible even without the existence of an adequate hash map type, just like groupingBy forbids null keys, though underspecified as well.
Note further that while the groupingBy collector deliberately rejects null keys in its implementation code, toMap is a good example of how actual behavior becomes part of the contract. In Java 8, toMap allows null keys but rejects null values, simply because it invokes Map.merge, which has that behavior. It seems this wasn’t an intended behavior in the first place. Now, in Java 9, the toMap collector without a merge function doesn’t use Map.merge anymore (JDK-8040892, see also this answer), but deliberately rejects null values in the collector code, to be behaviorally compatible with the previous version, simply because it was never said that the null behavior is intentionally unspecified.
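The behaviors described above can be probed with a small sketch like the following. It only reports what the running JDK does; the class and method names (CollectorNullBehavior and its three probes) are invented for this illustration, and the outcome reflects the OpenJDK implementation discussed here rather than anything the specification promises.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CollectorNullBehavior {
    // toMap without a merge function: a null key ends up in the resulting map.
    static boolean toMapAcceptsNullKey() {
        Map<String, Integer> map = Stream.of("a", null)
                .collect(Collectors.toMap(s -> s, s -> 1));
        return map.containsKey(null);
    }

    // toMap without a merge function: a null value triggers NullPointerException.
    static boolean toMapRejectsNullValue() {
        try {
            Map<String, Integer> map = Stream.of("a", "b")
                    .collect(Collectors.toMap(s -> s, s -> null));
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // groupingBy: a classifier result of null triggers NullPointerException.
    static boolean groupingByRejectsNullKey() {
        try {
            Map<String, List<String>> groups = Stream.of("a", null)
                    .collect(Collectors.groupingBy(s -> s));
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("toMap accepts null key:       " + toMapAcceptsNullKey());
        System.out.println("toMap rejects null value:     " + toMapRejectsNullValue());
        System.out.println("groupingBy rejects null key:  " + groupingByRejectsNullKey());
    }
}
```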
So, Collectors.toSet() (and likewise Collectors.toList()) have allowed null values for two major Java versions now, and there’s no specification saying that you must not take this for granted, so you can be quite sure that this won’t change in the future.