 

Why does Set.of() throw an IllegalArgumentException if the elements are duplicates?

In Java 9 new static factory methods were introduced on the Set interface, called of(), which accept multiple elements, or even an array of elements.

I wanted to turn a list into a set to remove any duplicate entries from the list, which can be done (prior to Java 9) using:

Set<String> set = new HashSet<>();
set.addAll(list);

But I thought it would be cool to use this new Java 9 static factory method doing:

Set.of(list.toArray())

where the list is a List of Strings, previously defined.

But alas, Java threw an IllegalArgumentException when the list contained duplicates, which is also what the Javadoc of the method states. Why is this?

Edit: this question is not a duplicate of another question about a conceptually equivalent topic, the Map.of() method, but distinctly different. Not all static factory of() methods behave the same. In other words, when I am asking something about the Set.of() method I would not click on a question dealing with the Map.of() method.

Benjamin asked Dec 04 '17 08:12



5 Answers

Set.of() is a shorthand for manually creating a small Set. In this case it would be a flagrant programming error if you gave it duplicate values, as you're supposed to write out the elements yourself. I.e. Set.of("foo", "bar", "baz", "foo"); is clearly an error on the programmer's part.

Your cool way is actually a really bad way. If you want to convert a List to a Set, you can do it with Set<Foo> foo = new HashSet<>(myList);, or any other way you wish (such as with streams and collecting toSet()). Advantages include not doing a useless toArray(), the choice of your own Set (you might want a LinkedHashSet to preserve order) etc. Disadvantages include having to type out a few more characters of code.
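To make the alternatives mentioned above concrete, here is a minimal sketch of the three conversions (HashSet, LinkedHashSet, and a stream collected with toSet()). The class and method names are made up for illustration and are not from the answer.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical demo class; only the JDK calls inside it are real API.
class ListToSetDemo {
    // Plain deduplication; iteration order of the result is unspecified.
    static Set<String> viaHashSet(List<String> list) {
        return new HashSet<>(list);
    }

    // Deduplication that keeps the first occurrence of each element, in list order.
    static Set<String> viaLinkedHashSet(List<String> list) {
        return new LinkedHashSet<>(list);
    }

    // Stream variant; Collectors.toSet() makes no promise about the concrete Set type.
    static Set<String> viaStream(List<String> list) {
        return list.stream().collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<String> list = List.of("b", "a", "b", "c"); // List.of accepts duplicates
        System.out.println(viaHashSet(list).size());     // 3
        System.out.println(viaLinkedHashSet(list));      // [b, a, c]
        System.out.println(viaStream(list).size());      // 3
    }
}
```

None of these throws on duplicates, and all three return sets that can still be modified afterwards.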

The original design idea behind the Set.of(), List.of() and Map.of() methods (and their numerous overloads) is explained in the question "What is the point of overloaded Convenience Factory Methods for Collections in Java 9" and in another linked discussion, where it's mentioned that the focus is on small collections, which are very common throughout the internal API, so performance advantages can be had. Although currently the methods delegate to the varargs method, offering no performance advantage, this can easily be changed (not sure what the hold-up is on that though).

Kayaman answered Oct 13 '22 09:10


The Set.of() factory methods produce immutable Sets for a given number of elements.

In the variants that support a fixed number of arguments (static <E> Set<E> of(), static <E> Set<E> of(E e1), static <E> Set<E> of(E e1, E e2), etc.) the requirement of not having duplicates is easier to understand: when you call the method Set.of(a,b,c), you are stating you wish to create an immutable Set of exactly 3 elements, so if the arguments contain duplicates, it makes sense to reject your input instead of producing a smaller Set.

While the Set<E> of(E... elements) variant is different (it allows creating a Set of an arbitrary number of elements), it follows the same logic as the other variants. If you pass n elements to that method, you are stating you wish to create an immutable Set of exactly n elements, so duplicates are not allowed.
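As a small sketch of that rule, the following shows both the fixed-argument and the varargs overload rejecting duplicates with an IllegalArgumentException. The class and helper names are invented for the example.

```java
import java.util.Set;

// Hypothetical demo class showing that Set.of rejects duplicates at runtime.
class SetOfDuplicatesDemo {
    // Calls the three-argument overload with a repeated element.
    static boolean fixedArgsRejectDuplicates() {
        try {
            Set.of("foo", "bar", "foo");
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    // Passing a String[] selects the varargs overload Set.of(E...).
    static boolean varargsRejectDuplicates(String[] elements) {
        try {
            Set.of(elements);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(fixedArgsRejectDuplicates());                             // true
        System.out.println(varargsRejectDuplicates(new String[] {"a", "b", "a"}));   // true
        System.out.println(varargsRejectDuplicates(new String[] {"a", "b", "c"}));   // false
    }
}
```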

You can still create a Set from a List (having potential duplicates) in a one-liner using:

Set<String> set = new HashSet<>(list); 

which was already available before Java 9.

Eran answered Oct 13 '22 11:10


You are expecting duplicates to be silently collapsed, just like HashSet does, I guess, but rejecting them was a deliberate decision (as Stuart Marks, the creator of these methods, explains). He even has an example like this:

Map.ofEntries(
    Map.entry("!", "Exclamation"),
    // ... lots of other entries, one of which accidentally repeats a key
    Map.entry("|", "VERTICAL_BAR")
);

Since silently accepting a duplicate in a long list like this could hide a mistake, they chose to prohibit it.

Also notice that Set.of() returns an immutable Set, so to get comparable behavior you could wrap your HashSet in an unmodifiable view:

Collections.unmodifiableSet(new HashSet<>(list))

Eugene answered Oct 13 '22 11:10


The primary design goal of the List.of, Set.of, Map.of, and Map.ofEntries static factory methods is to enable programmers to create these collections by listing the elements explicitly in the source code. Naturally there is a bias toward a small number of elements or entries, because they're more common, but the relevant characteristic here is that the elements are listed out in the source code.

What should the behavior be if duplicate elements are provided to Set.of or duplicate keys provided to Map.of or Map.ofEntries? Assuming that the elements are listed explicitly in the source code, this is likely a programming error. Alternatives such as first-wins or last-wins seem likely to cover up errors silently, so we decided that making duplicates be an error was the best course of action. If the elements are explicitly listed, it would be nice if this were a compile-time error. However, duplicates aren't detected until runtime,* so throwing an exception at that time is the best we could do.

* In the future, if all the arguments are constant expressions or are constant-foldable, the Set or Map creation could also be evaluated at compile time and also constant-folded. This might enable duplicates to be detected at compile time.
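To illustrate how a last-wins policy can hide a mistake while the factory methods surface it, here is a rough sketch. The keys and values are made up for the example, as are the class and method names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo contrasting HashMap's last-wins put with Map.of's rejection.
class DuplicateKeyDemo {
    // HashMap.put is last-wins: the second put silently replaces the first value.
    static Map<String, String> lastWins() {
        Map<String, String> map = new HashMap<>();
        map.put("!", "EXCLAMATION");
        map.put("!", "BANG"); // accidental repeat of the key, no error reported
        return map;
    }

    // Map.of with a repeated key throws IllegalArgumentException at runtime.
    static boolean mapOfRejectsDuplicateKey() {
        try {
            Map.of("!", "EXCLAMATION", "!", "BANG");
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(lastWins().get("!"));          // BANG
        System.out.println(mapOfRejectsDuplicateKey());   // true
    }
}
```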

What about the use case where you have a collection of elements and you want to deduplicate them? That's a different use case, and it's not well handled by Set.of and Map.ofEntries. You have to create an intermediate array first, which is quite cumbersome:

Set<String> set = Set.of(list.toArray());

This doesn't compile, because list.toArray() returns an Object[]. This will produce a Set<Object> which can't be assigned to Set<String>. You want toArray to give you a String[] instead:

Set<String> set = Set.of(list.toArray(new String[0]));

This typechecks, but it still throws an exception for duplicates! Another alternative was suggested:

Set<String> set = new HashSet<>(list);

This works, but you get back a HashSet, which is mutable, and which takes up a lot more space than the set returned from Set.of. You could deduplicate the elements through a HashSet, get an array from that, and then pass it to Set.of. That would work, but bleah.
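For reference, the roundabout Java 9 approach just described (deduplicate through a HashSet, take an array, pass it to Set.of) might look roughly like this. The class and method names are hypothetical, and the sketch assumes the list contains no nulls, since Set.of rejects null elements.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for the Java 9 workaround.
class DedupThenSetOf {
    static Set<String> toUnmodifiableSet(List<String> list) {
        // Step 1: the intermediate HashSet drops duplicates.
        // Step 2: toArray(new String[0]) yields a String[], so Set.of infers Set<String>.
        return Set.of(new HashSet<>(list).toArray(new String[0]));
    }

    public static void main(String[] args) {
        Set<String> set = toUnmodifiableSet(List.of("a", "b", "a"));
        System.out.println(set.size()); // 2
    }
}
```

The result is the compact unmodifiable set from Set.of, at the cost of a temporary HashSet and a temporary array.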

Fortunately, this is fixed in Java 10. You can now write:

Set<String> set = Set.copyOf(list);

This creates an unmodifiable set from the elements of the source collection, and duplicates do not throw an exception. Instead, an arbitrary one of the duplicates is used. There are similar methods List.copyOf and Map.copyOf. As a bonus, these methods skip creating a copy if the source collection is already an unmodifiable collection of the right type.
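A minimal sketch of this behavior, assuming Java 10 or later (the class and method names are illustrative only):

```java
import java.util.List;
import java.util.Set;

// Hypothetical demo class for Set.copyOf (requires Java 10+).
class CopyOfDemo {
    // Duplicates in the source list are collapsed instead of causing an exception.
    static Set<String> dedupe(List<String> list) {
        return Set.copyOf(list);
    }

    // Probes unmodifiability by attempting a mutation.
    static boolean isUnmodifiable(Set<String> set) {
        try {
            set.add("extra");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Set<String> set = dedupe(List.of("a", "b", "a"));
        System.out.println(set.size());           // 2
        System.out.println(isUnmodifiable(set));  // true
    }
}
```

As with Set.of, null elements are not allowed in the source collection.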

Stuart Marks answered Oct 13 '22 11:10


Set.of(E... elements)

The element type of the resulting set will be the component type of the array, and the size of the set will be equal to the length of the array.

Throws:

IllegalArgumentException - if there are any duplicate elements

It is clear from this that the method does not deduplicate for you: the size of the Set always equals the length of the array, so a duplicate can only be rejected.

The method is just here so that you can get a populated Set in one line:

Set.of("A","B","C");

But you have to take care of duplicates yourself. (This will simply iterate over the varargs and add them to the new Set.)

AxelH answered Oct 13 '22 10:10