Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there any technical reason why std::lower_bound is not specialized for red-black tree iterators?

I have always assumed that std::lower_bound() runs in logarithmic time if I pass a pair of red-black tree iterators (set::iterator or map::iterator) to it. I had to burn myself twice to notice that std::lower_bound() runs in O(n) time in this case, at least with the libstdc++ implementation. I understand that the standard doesn't have the concept of red-black tree iterators; std::lower_bound() will treat them as bidirectional iterators and advance them in linear time. I still don't see any reason why the implementation couldn't create an implementation specific iterator tag for red-black tree iterators and call a specialized lower_bound() if the passed in iterators happen to be red-black tree iterators.

Is there any technical reason why std::lower_bound() is not specialized for red-black tree iterators?


UPDATE: Yes, I am aware of the find member functions but it is so not the point. (In a templated code I may not have access to the container or work only on a part of container.)


After the bounty has expired: I find Mehrdad's and Yakk's answers the most convincing. I couldn't decide between the too; I am giving the bounty to Mehrdad and accepting Yakk's answer.

like image 541
Ali Avatar asked Jan 05 '14 14:01

Ali


3 Answers

There are multiple reasons:

  1. When using the non-member version a different predicate can be used. In fact, a different predicate has to be used when using, e.g., a std::map<K, V> as the map predicate operates on Ks while the range operates on pairs of K and V.
  2. Even if the predicate is compatible, the function has an interface using a pair of nodes somewhere in the tree rather than the root node which would be needed for an efficient search. Although it is likely that there are parent pointers, requiring so for the tree seems inappropriate.
  3. The iterators provided to the algorithm are not required to be the t.begin() and the t.end() of the tree. They can be somewhere in the tree, making the use of the tree structure potentially inefficient.
  4. The standard library doesn't have a tree abstraction or algorithms operating on trees. Although the associative ordered containers are [probably] implemented using trees the corresponding algorithms are not exposed for general use.

The part I consider questionable is the use of a generic name for an algorithm which has linear complexity with bidirectional iterators and logarithmic complexity with random access iterators (I understand that the number of comparisons has logarithmic complexity in both cases and that the movements are considered to be fast).

like image 124
Dietmar Kühl Avatar answered Oct 19 '22 17:10

Dietmar Kühl


There is no technical reason why this could not be implemented.

To demonstrate, I will sketch out a way to implement this.

We add a new Iterator category, SkipableIterator. It is a subtype of BiDirectionalIterator and a supertype of RandomAccessIterator.

SkipableIterators guarantee that the function between done in a context where std::between is visible works.

template<typeanme SkipableIterator>
SkipableIterator between( SkipableIterator begin, SkipableIterator end )

between returns an iterator between begin and end. It returns end if and only if ++begin == end (end is right after begin).

Conceptually, between should efficiently find an element "about half way between" begin and end, but we should be careful to allow a randomized skip list or a balanced red black tree to both work.

Random access iterators have a really simple implementation of between -- return (begin + ((end-begin)+1)/2;

Adding a new tag is also easy. Derivation makes existing code work well so long as they properly used tag dispatching (and did not explicitly specialize), but there is a small concern of breakage here. We could have "tag versions" where iterator_category_2 is a refinement of iterator_category (or soemthing less hacky), or we could use a completely different mechanism to talk about skipable iterators (an independent iterator trait?).

Once we have this ability, we can write a fast ordered searching algorithms that works on map/set and multi. It would also work on a skip list container like QList. It might be even the same implementation as the random access version!

like image 6
Yakk - Adam Nevraumont Avatar answered Oct 19 '22 18:10

Yakk - Adam Nevraumont


Great question. I honestly think there's no good/convincing/objective reason for this.

Almost all the reasons I see here (e.g. the predicate requirement) are non-issues to me. They might be inconvenient to solve, but they're perfectly solvable (e.g. just require a typedef to distinguish predicates).

The most convincing reason I see in the topmost answer is:

Although it is likely that there are parent pointers, requiring so for the tree seems inappropriate.

However, I think it's perfectly reasonable to assume parent pointers are implemented.

Why? Because the time complexity of set::insert(iterator, value) is guaranteed to be amortized constant time if the iterator points to the correct location.

Consider that:

  1. The tree must stay self-balancing.
  2. Keeping a tree balanced requires looking at the parent node at every modification.

How can you possibly avoid storing parent pointers here?

Without parent pointers, in order to ensure the tree is balanced after the insertion, the tree must be traversed starting from the root every single time, which is certainly not amortized constant time.

I obviously can't mathematically prove there exists no data structure that can provide this guarantee, so there's clearly the possibility that I'm wrong and this is possible.
However, in the absence of such data structures, what I'm saying is that this is a reasonable assumption, given by the fact that all the implementations of set and map I've seen are in fact red-black trees.


Side note, but note that we simply couldn't partially-specialize functions (like lower_bound) in C++03.
But that's not really a problem because we could have just specialized a type instead, and forwarded the call to a member function of that type.

like image 6
user541686 Avatar answered Oct 19 '22 18:10

user541686