The code shown below does output:
[b]
[a, b]
However I would expect it to print two identical lines in the output.
import java.util.*;
public class Test{
static void test(String... abc) {
Set<String> s = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
s.addAll(Arrays.asList("a", "b"));
s.removeAll(Arrays.asList(abc));
System.out.println(s);
}
public static void main(String[] args) {
test("A");
test("A", "C");
}
}
The spec clearly states that removeAll
"Removes all this collection's elements that are also contained in the specified collection."
So from my understanding current behavior is unpredictable . Please help me understand this
You only read documentation partly. You forgot one important paragraph from TreeSet
:
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the
Set
interface. (SeeComparable
orComparator
for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but aTreeSet
instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.
Now removeAll
implementation comes from AbstractSet
and utilizes equals
method. According to your code you will have that "a".equals("A")
is not true
so that elements are not considered equal even if you provided a comparator which manages them when used in the TreeSet
itself. If you try with a wrapper then the problem goes away:
import java.util.*;
import java.lang.*;
class Test
{
static class StringWrapper implements Comparable<StringWrapper>
{
public final String string;
public StringWrapper(String string)
{
this.string = string;
}
@Override public boolean equals(Object o)
{
return o instanceof StringWrapper &&
((StringWrapper)o).string.compareToIgnoreCase(string) == 0;
}
@Override public int compareTo(StringWrapper other) {
return string.compareToIgnoreCase(other.string);
}
@Override public String toString() { return string; }
}
static void test(StringWrapper... abc)
{
Set<StringWrapper> s = new TreeSet<>();
s.addAll(Arrays.asList(new StringWrapper("a"), new StringWrapper("b")));
s.removeAll(Arrays.asList(abc));
System.out.println(s);
}
public static void main(String[] args)
{
test(new StringWrapper("A"));
test(new StringWrapper("A"), new StringWrapper("C"));
}
}
This because you are now providing a consistent implementation between equals
and compareTo
of your object so you never have incoherent behavior between how the objects are added inside the sorted set and how all the abstract behavior of the set uses them.
This is true in general, a sort of rule of three for Java code: if you implement compareTo
or equals
or hashCode
you should always implement all of them to avoid problems with standard collections (even if hashCode is less crucial unless you are using these objects in any hashed collection). This is specified many times around java documentation.
This is an inconsistency in the implementation of TreeSet<E>
, bordering on the bug. The code will ignore custom comparator when the number of items in the collection that you pass to removeAll
is greater than or equal to the number of items in the set.
The inconsistency is caused by a small optimization: if you look at the implementation of removeAll
, which is inherited from AbstractSet
, the optimization goes as follows:
public boolean removeAll(Collection<?> c) {
boolean modified = false;
if (size() > c.size()) {
for (Iterator<?> i = c.iterator(); i.hasNext(); )
modified |= remove(i.next());
} else {
for (Iterator<?> i = iterator(); i.hasNext(); ) {
if (c.contains(i.next())) {
i.remove();
modified = true;
}
}
}
return modified;
}
you can see that the behavior is different when c
has fewer items than this set (top branch) vs. when it has as many or more items (bottom branch).
Top branch uses the comparator associated with this set, while the bottom branch uses equals
for comparison c.contains(i.next())
- all in the same method!
You can demonstrate this behavior by adding a few extra elements to the original tree set:
s.addAll(Arrays.asList("x", "z", "a", "b"));
Now the output for both test cases becomes identical, because remove(i.next())
utilizes the comparator of the set.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With