Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ith order statistic using C++'s STL

Given an empty array, I need to make two type of queries

  1. Inserting an element in the array

  2. Finding the index of some element k (obviously the array has to be kept sorted)

This can be done be using set container

set<int> st;
set.insert(t);

This will insert my element in O(log(n)).

And for 2nd query

set<int>::iterator it;
it = st.find(k);
idx = distance(st.begin(), it);

This takes O(n) time. (O(n) [for distance()[ + O(log(n) [for set::find()] ).

Is there any way to do both queries in O(log(n)) using the predefined containers of C++?

http://www.cplusplus.com/reference/stl/

like image 659
Aman Singhal Avatar asked Feb 04 '13 17:02

Aman Singhal


People also ask

What is ith order statistics?

The ith order statistic of a set of n elements is the ith smallest element. For example, the minimum of a set of elements is the first order statistic (i = 1), and the maximum is the nth order statistic (i = n). A median, informally, is the "halfway point" of the set.

What is an order statistic Tree explain and give an example?

In computer science, an order statistic tree is a variant of the binary search tree (or more generally, a B-tree) that supports two additional operations beyond insertion, lookup and deletion: Select(i) – find the i'th smallest element stored in the tree.

Which features should be supported by every node of order static tree?

It should support four operations: Insert, Delete, Select and Rank.


2 Answers

I don't think this is possible with the containers of the standard library since supporting access by index would require changing the implementation (add a counter to each node). This would increase the size of each node. And C++s philosophy is "don't pay what you don't use".

If you really need this, there's a countertree implementation suggested for boost (and it supports at least some of the C++11 features) which fulfills your requirements.

like image 200
ipc Avatar answered Oct 05 '22 23:10

ipc


No. It is not possible (with the predefined containers). The sequence containers of the C++ Standard Library have either:

  • O(1) random access and O(N) insertion/removal or
  • O(N) random access and O(1) insertion/removal

Note that deque is an exception, but only when the insertion/removal takes place at the ends of the array. The general case is still O(N).

Furthermore, the classification of iterators does not include a category for this case. You have the bidirectional iterators (those of list, set, multiset, map and multimap), which take O(N) time to jump to a random position, and the next category is for random access iterators (those of vector, deque and string). There is no intermediate category.

Adding a new category would not be trivial at all. The library also implements a lot of algorithms (like for_each) that work with containers. There is an implementation for every iterator category.

Order statistic trees have been proposed at Boost several times. As far as I know:

  1. 2004: First suggestion (I don't know if it was implemented)
  2. 2006: "Hierarchical Data Structures"
  3. 2006: AVL Array (renamed as "Rank List" in Boost)
  4. 2012: Counter tree

The main difficulty for them being accepted was the generalized opinion that they were not a benefit, but a hazard. Today's programmers are used to solve all the problems they know with the typical containers. Experienced programmers fear that newbies might blindly use the proposed container for everything, instead of choosing carefully.

like image 30
comocomocomocomo Avatar answered Oct 06 '22 00:10

comocomocomocomo