Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why doesn't removing from a TreeSet with a custom comparator remove a larger set of items?

Using both Java 8 and Java 11, consider the following TreeSet with a String::compareToIgnoreCase comparator:

final Set<String> languages = new TreeSet<>(String::compareToIgnoreCase);
languages.add("java");
languages.add("c++");
languages.add("python");

System.out.println(languages);                 // [c++, java, python]

When I try to remove the exact elements present in the TreeSet, it works: all of those specified are removed:

languages.removeAll(Arrays.asList("PYTHON", "C++"));

System.out.println(languages);                 // [java]

However, if I try to remove instead more than is present in the TreeSet, the call doesn't remove anything at all (this is not a subsequent call but called instead of the snippet above):

languages.removeAll(Arrays.asList("PYTHON", "C++", "LISP"));

System.out.println(languages);                 // [c++, java, python]

What am I doing wrong? Why does it behave this way?

Edit: String::compareToIgnoreCase is a valid comparator:

(l, r) -> l.compareToIgnoreCase(r)
like image 828
Nikolas Charalambidis Avatar asked Jan 01 '20 13:01

Nikolas Charalambidis


People also ask

Can we use comparator with TreeSet?

TreeSet shares an important function of setting and returning the comparator that can be used to order the elements in a TreeSet. The method returns a Null value if the set follows the natural ordering pattern of the elements.

How does TreeSet remove work?

TreeSet remove() Method in Java TreeSet. remove(Object O) method is to remove a particular element from a Tree set. Parameters: The parameter O is of the type of Tree set and specifies the element to be removed from the set.


1 Answers

Here's the javadoc of removeAll():

This implementation determines which is the smaller of this set and the specified collection, by invoking the size method on each. If this set has fewer elements, then the implementation iterates over this set, checking each element returned by the iterator in turn to see if it is contained in the specified collection. If it is so contained, it is removed from this set with the iterator's remove method. If the specified collection has fewer elements, then the implementation iterates over the specified collection, removing from this set each element returned by the iterator, using this set's remove method.

In your second experiment, you're in the first case of the javadoc. So it iterates over "java", "c++", etc. and checks if they're contained in the Set returned by Set.of("PYTHON", "C++"). They're not, so they're not removed. Use another TreeSet using the same comparator as argument, and it should work fine. Using two different Set implementations, one using equals(), and the other one using a comparator, is a dangerous thing to do indeed.

Note that there is a bug opened about this: [JDK-8180409] TreeSet removeAll inconsistent behaviour with String.CASE_INSENSITIVE_ORDER.

like image 133
JB Nizet Avatar answered Oct 18 '22 05:10

JB Nizet