Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Do Java methods headSet and tailSet in Java class TreeSet work in log(N) time?

It is written in the JavaDoc that the basic operations on a TreeSet work in log(N) time, where N is the size of the set. It seems to me that the headSet and tailSet methods should find the beginning of the views they're calculating via something like a binary search if the set is big enough, but the JavaDoc is silent on this.

like image 923
Blaise Zydeco Avatar asked Oct 03 '18 10:10

Blaise Zydeco


People also ask

What is headSet in TreeSet in Java?

The headSet(E toElement) method is used to return a view of the portion of this set whose elements are strictly less than toElement(input). The returned set is backed by this set, so changes in the returned set are reflected in this set, and vice-versa.

Which of the following methods does TreeSet use to determine order and equality?

It uses compare() (or compareTo()) method to determine the equality of two elements.

How do Treesets work internally?

When we implement a TreeSet, it creates a TreeMap to store the elements. It sorts the elements either naturally or using the user define comparator. When the object of a TreeSet is created, it automatically invokes the default constructor and creates an object of TreeMap and assigns comparator as null.


1 Answers

The docs say nothing about headSet and tailSet's time complexity. All they say is:

Returns a view of the portion of this set whose elements are strictly less than toElement. The returned set is backed by this set, so changes in the returned set are reflected in this set, and vice-versa. The returned set supports all optional set operations that this set supports.

(I'm emphasizing view). And view is indeed the most important part. Creating views is always a O(1) operation, worst case, as only wrapper classes are created. No key search is performed, just type checks, and in fact no other operations are triggered, either.

Here's TreeSet.headSet(E toElement) code:

public SortedSet<E> headSet(E toElement) {
    return headSet(toElement, false);
}

And here's TreeSet.headSet(E toElement, boolean inclusive) code:

public NavigableSet<E> headSet(E toElement, boolean inclusive) {
    return new TreeSet<>(m.headMap(toElement, inclusive));
}

As you may know, TreeSet is backed by a TreeMap instance. This is what the m property actually is. So the operation above delegates to the TreeMap.headMap(E toElement, boolean inclusive) method and then creates a new TreeSet instance backed by the NavigableMap instance returned by TreeMap.headMap(E toElement, boolean inclusive).

First, let's see the TreeSet constructor:

TreeSet(NavigableMap<E,Object> m) {
    this.m = m;
}

As you see, this is just an assignment.

Then, let's dig into the TreeMap.headMap(E toElement, boolean inclusive) method's code:

public NavigableMap<K,V> headMap(K toKey, boolean inclusive) {
    return new AscendingSubMap<>(this,
                                 true,  null,  true,
                                 false, toKey, inclusive);
}

This just creates and returns an instance of the AscendingSubMap static nested class. Here's AscendingSubMap constructor's code:

AscendingSubMap(TreeMap<K,V> m,
                boolean fromStart, K lo, boolean loInclusive,
                boolean toEnd,     K hi, boolean hiInclusive) {
    super(m, fromStart, lo, loInclusive, toEnd, hi, hiInclusive);
}

This just delegates to the superclass' constructor (AscendingSubMap's superclass is NavigableSubMap static nested abstract class). Here's NavigableSubMap constructor's code:

NavigableSubMap(TreeMap<K,V> m,
                boolean fromStart, K lo, boolean loInclusive,
                boolean toEnd,     K hi, boolean hiInclusive) {
    if (!fromStart && !toEnd) {
        if (m.compare(lo, hi) > 0)
            throw new IllegalArgumentException("fromKey > toKey");
    } else {
        if (!fromStart) // type check
            m.compare(lo, lo);
        if (!toEnd)
            m.compare(hi, hi);
    }

    this.m = m;
    this.fromStart = fromStart;
    this.lo = lo;
    this.loInclusive = loInclusive;
    this.toEnd = toEnd;
    this.hi = hi;
    this.hiInclusive = hiInclusive;
}

As you see, this is only checking the arguments for correctness and assigning them to properties.

Here m is the enclosing TreeMap instance (remember that NavigableSubMap is a static nested abstract class). The TreeMap.compare(Object k1, Object k2) method is as follows:

final int compare(Object k1, Object k2) {
    return comparator==null ? ((Comparable<? super K>)k1).compareTo((K)k2)
        : comparator.compare((K)k1, (K)k2);
}

This method just compares its arguments as appropriate and here it's used just to check the bounds of the submap. (Remember that TreeMap keys can be either Comparable or not. If they aren't, a Comparator must be provided when constructing the TreeMap instance, and this is what the comparator attribute is in the code above).

And this is all that is done when invoking the headSet method. tailSet follows the same pattern (just the bounds of the final submap are different).

So, as a conclusion, headSet and tailSet are actually O(1) worst case.

Note: I've checked the code for both JDK 8 v1.8.0_181 and openjdk version "11" 2018-09-25 versions. I'm pretty certain that the in-between versions haven't been changed either.


EDIT:

Concerning time complexity of accessing the first element of the view returned by headSet, i.e. if you were to iterate it, the implementation needs to perform an O(logN) operation to reach the leftmost leaf of the TreeSet (after all, a TreeSet is backed by a TreeMap, which in turn is implemented as a red/black tree).

Iterating the set view returned by tailSet has the same time complexity: O(logN). This is because the implementation needs to perform a search of the node whose value is closer to the provided lower bound, and this search is also O(logN).

like image 123
fps Avatar answered Oct 05 '22 16:10

fps