Queue data structure supporting fast k-th largest element finding

Question

I'm faced with a problem which requires a Queue data structure supporting fast k-th largest element finding.

The requirements of this data structure are as follows:

The elements in the queue are not necessarily integers, but they must be comparable to each other, i.e we can tell which one is greater when we compare two elements(they can be equal as well).
The data structure must support enqueue(adds the element at the tail) and dequeue(removes the element at the head).
It can quickly find the k-th largest element in the queue, pls note k is not a constant.
You can assume that operations enqueue , dequeue and k-th largest element finding all occur with the same frequency.

enter image description here

My idea is to use a modified balanced binary search tree. The tree is the same as ordinary balanced binary search tree except that every node_i is augmented with another field n_i, n_i denotes the number of nodes contained in the subtree with root node_i. The aforementioned operations are supported as follows:

For simplicity assume that all elements are distinct.

Enqueue(x): x is first inserted into the tree, suppose the corresponding node is node_t, we append pair(x,pointer to node_t) to the queue.

Dequeue: suppose (e1, node₁) is the element at the head, node₁ is the pointer into the tree corresponding to e1. We delete node₁ from the tree and remove (e1, node₁) from the queue.

K-th largest element finding: suppose root node is node_root, its two children are node_left and node_right(suppose they all exist), we compare K with n_root , three cases may happen:

if K< n_left we find the K-th largest element in the left subtree of n_root;
if K>n_root-n_right we find the (K-n_root+n_right)-th largest element in the right subtree of n_root;
otherwise n_root is the node we want.

The time complexity of all the three operations are O(log_N) , where N is the number of elements currently in the queue.

How can I speed up the operations mentioned above? With what data structures and how?

amit · Accepted Answer

Note - you cannot achieve better then O(logn) for all, at best you need to "chose" which op you care for the most. (Otherwise, you could sort in O(n) by feeding the array to the DS, and querying 1st, 2nd, 3rd, ... nth elements)

Using a skip list instead of a Balanced BST as the sorted structure can reduce dequeue complexity to O(1) average case. It does not affect complexity of any other op.
To remove from a skip list - all you need to do is to get to the element using the pointer from the head of the queue, and follow the links up and remove each. The expected number of nodes needed to be deleted is 1 + 1/2 + 1/4 + ... = 2.
find Kth can be achieved in O(logK) by starting from the leftest node (and not the root) and making your way up until you find you have "more sons then needed", and then treat the just found node as the root just like the algorithm in the question. Though it is better in asymptotic complexity - the constant factor is double.

Heejin · Answer

I found an interesting paper:

Sliding-Window Top-k Queries on Uncertain Streams published in VLDB 2008 and cited by 71.

https://www.cse.ust.hk/~yike/wtopk.pdf

VLDB is the best conference in database research area, and the number of citations proves the data structure actually works.

The paper looks pretty difficult, but if you really need improve your data structure, I suggest you to read this paper or papers in the reference page of this paper.

Queue data structure supporting fast k-th largest element finding

Tags:

outlaw

2 Answers

amit

Heejin

Recent Activity

Donate For Us

Queue data structure supporting fast k-th largest element finding

Tags:

outlaw

2 Answers

amit

Heejin

Related questions

Recent Activity

Donate For Us