Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

TreeSet giving incorrect output - Java8

Tags:

java

treeset

While working with a tree set, I found very peculiar behavior.

As per my understanding following program should print two identical lines:

public class TestSet {
    static void test(String... args) {
        Set<String> s = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
        s.addAll(Arrays.asList("a", "b"));
        s.removeAll(Arrays.asList(args));
        System.out.println(s);
    }

    public static void main(String[] args) {
        test("A");
        test("A", "C");
    }
}

but strangely it prints:

[b]
[a, b] 

I am unable to understand - Why is tree set behaving like this?

like image 861
Sachin Sachdeva Avatar asked May 11 '17 13:05

Sachin Sachdeva


People also ask

Does TreeSet is synchronized or not?

The implementation of a TreeSet is not synchronized. This means that if multiple threads access a tree set concurrently, and at least one of the threads modifies the set, it must be synchronized externally. This is typically accomplished by synchronizing some object that naturally encapsulates the set.

Does TreeSet allow duplicate keys?

TreeSet cannot contain duplicate elements.

What is difference between SortedSet and TreeSet?

TreeSet maintains an object in sorted order. SortedSet maintains an object in sorted order.

What is the difference between TreeSet and SortedSet select one?

What is the difference between TreeSet and SortedSet? Explanation: SortedSet is an interface. It maintains an ordered set of elements. TreeSet is an implementation of SortedSet.


3 Answers

This happens because a SortedSet’s Comparator is used for sorting, but removeAll relies on the equals method of each element. From the SortedSet documentation:

Note that the ordering maintained by a sorted set (whether or not an explicit comparator is provided) must be consistent with equals if the sorted set is to correctly implement the Set interface. (See the Comparable interface or Comparator interface for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a sorted set performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the sorted set, equal. The behavior of a sorted set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.

The explanation of “consistent with equals” is defined in the Comparable documentation:

The natural ordering for a class C is said to be consistent with equals if and only if e1.compareTo(e2) == 0 has the same boolean value as e1.equals(e2) for every e1 and e2 of class C. Note that null is not an instance of any class, and e.compareTo(null) should throw a NullPointerException even though e.equals(null) returns false.

It is strongly recommended (though not required) that natural orderings be consistent with equals. This is so because sorted sets (and sorted maps) without explicit comparators behave "strangely" when they are used with elements (or keys) whose natural ordering is inconsistent with equals. In particular, such a sorted set (or sorted map) violates the general contract for set (or map), which is defined in terms of the equals method.

In summary, your Set’s Comparator behaves differently than the elements’ equals method, causing unusual (though predictable) behavior.

like image 78
VGR Avatar answered Oct 21 '22 02:10

VGR


Well, this surprised me, I don't know if I'm correct, but look at this implementation in AbstractSet:

public boolean removeAll(Collection<?> c) {
  Objects.requireNonNull(c);
  boolean modified = false;

  if (size() > c.size()) {
    for (Iterator<?> i = c.iterator(); i.hasNext(); )
      modified |= remove(i.next());
  } else {
    for (Iterator<?> i = iterator(); i.hasNext(); ) {
      if (c.contains(i.next())) {
        i.remove();
        modified = true;
      }
    }
  }
  return modified;
}

Basically in your example, the size of set is equal to the size of arguments you want to remove, so the else condition is invoked. In that condition there is a check if your collection of arguments to remove contains the current element of iterator, and that check is case sensitive, so it checks if c.contains("a") and it returns false, because c contains "A", not "a", so the element is not removed. Notice that when you add an element to your set s.addAll(Arrays.asList("a", "b", "d")); it works correctly, because size() > c.size() is now true, thus there is no contains check anymore.

like image 15
Shadov Avatar answered Oct 21 '22 02:10

Shadov


To add some information about why the remove of TreeSet actually removes case-insensively in your example (and provided that you follow the if (size() > c.size()) path as explained in the answer by @Shadov) :

This is the removemethod in TreeSet :

public boolean remove(Object o) {
        return m.remove(o)==PRESENT;
    }

it calls remove from its internal TreeMap :

public V remove(Object key) {
    Entry<K,V> p = getEntry(key);
    if (p == null)
        return null;

    V oldValue = p.value;
    deleteEntry(p);
    return oldValue;
}

which calls getEntry

 final Entry<K,V> getEntry(Object key) {
        // Offload comparator-based version for sake of performance
        if (comparator != null)
            return getEntryUsingComparator(key);
        if (key == null)
            throw new NullPointerException();
        @SuppressWarnings("unchecked")
            Comparable<? super K> k = (Comparable<? super K>) key;
        Entry<K,V> p = root;
        while (p != null) {
            int cmp = k.compareTo(p.key);
            if (cmp < 0)
                p = p.left;
            else if (cmp > 0)
                p = p.right;
            else
                return p;
        }
        return null;
    }

If there is a Comparator (as in your example), the entry is searched based on this Comparator (this is done by getEntryUsingComparator), that's why it is actually found (then removed) , despite the case difference.

like image 3
Arnaud Avatar answered Oct 21 '22 02:10

Arnaud