
Is Java’s Collectors.toSet() guaranteed to permit nulls?

The Set interface makes no promises on whether implementations permit null elements. Each implementation is supposed to declare this in its documentation.

Collectors.toSet() promises to return an implementation of Set but explicitly makes “no guarantees on the type, mutability, serializability, or thread-safety of the Set returned”. Null-safety is not mentioned.

The current implementation of Collectors.toSet() in OpenJDK always uses HashSet, which permits null elements, but this could change in the future, and other JDK implementations may behave differently.
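For reference, the implementation detail mentioned above can be observed at runtime. This is only a sketch of what a given JDK happens to do, not something the specification promises; the class and method names are made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ToSetImplementation {
    // Reports whether this runtime's Collectors.toSet() happens to produce a HashSet.
    // The Javadoc makes no guarantee about the concrete type, so treat the
    // result as an observation about one JDK, not as a contract.
    static boolean returnsHashSet() {
        Set<String> set = Stream.of("a", "b").collect(Collectors.toSet());
        return set instanceof HashSet;
    }
}
```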

If a Set implementation prohibits null elements, it throws NullPointerException at various times, in particular on an attempt to add(null). So if Collectors.toSet() decided to use a null-intolerant Set implementation, calling stream.collect(Collectors.toSet()) on a stream that contains nulls would presumably throw. The specification of collect does not list any exceptions, nor does the specification of any of the Collector methods. This could suggest that the collect call permits nulls within the stream. On the other hand, it's not clear that this means much, as NullPointerException is an unchecked exception and doesn't strictly have to be listed.

Is this specified more clearly anywhere else? In particular, is the following code guaranteed not to throw? Is it guaranteed to return true?

import java.util.stream.*;

class Test {
    public static boolean setContainsNull() {
        return Stream.of("A", "list", "of", null, "strings")
                     .collect(Collectors.toSet())
                     .contains(null);
    }
}

If not, then I assume we should always ensure a stream contains no nulls before using Collectors.toSet() or be ready to handle NullPointerException. (Is this exception alone enough though?) Alternatively, when this is unacceptable or hard, we can request a specific set implementation using code like Collectors.toCollection(HashSet::new).
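To make the two fallbacks concrete, here is a minimal sketch of both: filtering nulls out before collecting, and requesting HashSet explicitly. The class and method names are only illustrative.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NullSafeCollecting {
    // Option 1: drop nulls before collecting, so whatever Set implementation
    // toSet() picks never sees a null element.
    static Set<String> withoutNulls() {
        return Stream.of("A", "list", "of", null, "strings")
                     .filter(Objects::nonNull)
                     .collect(Collectors.toSet());
    }

    // Option 2: keep the nulls, but name the Set implementation explicitly.
    // HashSet documents that it permits the null element.
    static Set<String> withNulls() {
        return Stream.of("A", "list", "of", null, "strings")
                     .collect(Collectors.toCollection(HashSet::new));
    }
}
```

With the second option the null tolerance comes from HashSet's own documentation rather than from anything Collectors.toSet() says.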

Edit: there is an existing question that sounds superficially similar, and this question got closed as a supposed duplicate of that. However, the linked question does not address Collectors.toSet() at all. Moreover, the answers to that question form the underlying assumptions of my question. That question asks: are nulls allowed in streams? Yes. But what happens when a (perfectly allowed) stream that contains nulls gets collected via a standard collector?

asked Nov 09 '17 22:11 by Chortos-2




1 Answer

There is a difference between deliberately unspecified behaviors, like “type, mutability, serializability, or thread-safety” and underspecified behavior, like the null support.

Whenever a behavior is underspecified, the actual behavior of the reference implementation tends to become an established fact. Compatibility constraints then mean it can't be changed later, or at least not without a strong reason, even if it contradicts the original intention.

Note that the reserved right to return a truly immutable or non-serializable Set went unused simply because no such type existed when Java 8 was released. Enforcing non-null behavior, by contrast, was possible even without an adequate hash map type: groupingBy forbids null keys in just this way, although its null behavior is underspecified as well.

Note further that while the groupingBy collector deliberately rejects null keys in its implementation code, toMap is a good example of how actual behavior becomes part of the contract. In Java 8, toMap allows null keys but rejects null values, simply because it invokes Map.merge, which has that behavior. It seems this wasn't intended in the first place. Now, in Java 9, the toMap collector without a merge function doesn't use Map.merge anymore (JDK-8040892, see also this answer), but deliberately rejects null values in the collector code to stay behaviorally compatible with the previous version, simply because it was never said that the null behavior is intentionally unspecified.
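The null handling described above can be checked with a small sketch like the following. It assumes an OpenJDK runtime of the versions discussed here; the class, the helper throwsNpe, and the sample data are made up for illustration.

```java
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NullsInMapCollectors {
    // Hypothetical helper: true if the action throws NullPointerException.
    static boolean throwsNpe(Runnable action) {
        try {
            action.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // toMap with a value mapper that yields null for one element.
    static boolean toMapRejectsNullValue() {
        return throwsNpe(() -> Stream.of("a", "b")
                .collect(Collectors.toMap(s -> s, s -> s.equals("b") ? null : s)));
    }

    // toMap with a key mapper that yields null for one element.
    static boolean toMapAcceptsNullKey() {
        Map<String, String> map = Stream.of("a", "b")
                .collect(Collectors.toMap(s -> s.equals("b") ? null : s, s -> s));
        return map.containsKey(null);
    }

    // groupingBy with a classifier that yields null for one element.
    static boolean groupingByRejectsNullKey() {
        return throwsNpe(() -> Stream.of("a", "b")
                .collect(Collectors.groupingBy(s -> s.equals("b") ? null : s)));
    }
}
```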

So, Collectors.toSet() (and likewise Collectors.toList()) have allowed null values for two major Java versions now, and no specification says that you must not take this for granted, so you can be quite sure that this won't change in the future.
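If you rely on this, a small check along these lines pins the behavior down in your own test suite. It is a sketch with made-up names, showing what current OpenJDK releases do rather than a documented guarantee.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NullTolerantCollectors {
    // Collects a stream containing nulls into a List; nulls keep their positions.
    static List<String> collectToList() {
        return Stream.of("A", null, "B", null).collect(Collectors.toList());
    }

    // Collects the same elements into a Set; the two nulls collapse into one element.
    static Set<String> collectToSet() {
        return Stream.of("A", null, "B", null).collect(Collectors.toSet());
    }
}
```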

answered Sep 29 '22 00:09 by Holger