Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Dynamic order statistic: get k-th element in constant time?

So, I'm trying to implement a Data Structure to handle Dynamic Order Statistic. The Data Structure has following operations:

  • add(x): inserts a new element with value x
  • get(k): returns the k-th smallest element: k = ceiling(n/a), where n = amount of elements in the data structure and a = constant factor.
  • reset: resets the whole datastructuer, i.e. the data structure is "empty after it

I implemented my data structure using a balanced AVL tree. Using this the operations have following time complexity:

  • add(x): O(log(n))
  • get(k): O(log(n))

Here is my implemenation for the get(k) which uses O(log(n)) time:

public static int get(Node current, int k) {
    int l = tree.sizeLeft(current) + 1;
    if(k == l) {
        return current.value;
    } else if(k < l) {
        if(current.left == null) {
            return current.value;
        }
        return get(current.left, k);
    } else {
        if(current.right == null) {
            return current.value;
        }
        return get(current.right, k);
    }
}

And here's my implementation for the node class:

class Node {
int height, value, bal, size;   // bal = balanceFactor, size = amount of nodes in tree 
                                   rooted at current node
Node leftChild = null;
Node rightChild = null;

public Node(int val) {
    value = val;
    height = 1;
    size = 1; 
}

}

However, my task is to implement a data structure that can handle the above operations and only taking O(1) (constant) time for the operation get(k). (And add(x) still taking O(log(n)) time). Also, I'm not allowed to use a hashmap.

Is it possible to modify my implementation in order to get constant time? Or, what kind of datastructure can handle the get(k) operation in constant time?

like image 483
G.M Avatar asked Jan 06 '18 16:01

G.M


1 Answers

As far as I understand the k parameter basically grows with the size of the elements, which means that for each n you know the exact value of k.

If this is the case, then my suggestion is to use a max-heap and and a min-heap. The max-heap organizes elements (smallerequals than the n/a th element) in a heap structure, allowing to access the largest element (root) in constant time. Accordingly the min-heap organizes elements (larger than the n/a th element) in a heap structure, allowing to access the smallest element (root) in constant time.

When new elements arrive (add) you place them in the corresponding heap in O(log n). If the max-heap becomes larger or smaller than (n/a), you rebalance between the two heaps in O(log n)

Your get() function now just needs to return the root element of the max-heap in O(1).

In Java you can use a priority queue for the max-heap (and min-heap)

PriorityQueue<Integer> heap = new PriorityQueue<>(10, Collections.reverseOrder());

The class could look like this

import java.util.Collections;
import java.util.PriorityQueue;

public class DOS
{

    double a;
    PriorityQueue<Integer> heap;
    PriorityQueue<Integer> heap_large_elements;

    public DOS(double a) {
        this.a = a;
        this.heap = new PriorityQueue<>(10, Collections.reverseOrder());
        this.heap_large_elements = new PriorityQueue<>();
    }

    public void add(int x){
        if(heap.size() == 0 || x < heap.peek())
            heap.add(x); // O(log n/a)
        else
            heap_large_elements.add(x); // O(log n)

        //possible rebalance operations
        int n = heap.size() + heap_large_elements.size();
        if(heap.size() > Math.ceil(n/a)){
            heap_large_elements.add(heap.poll()); //O(log n)
        }else if(heap.size() < Math.ceil(n/a)) {
            heap.add(heap_large_elements.poll()); //O(log n)
        }
    }

    public int get(){
        return heap.peek(); //O(1)
    }

    public static void main(String[] args)
    {
        DOS d = new DOS(3);
        d.add(5);d.add(6);d.add(2);d.add(3);d.add(8);d.add(12);d.add(9);
        System.out.println(d.get());
    }

}

Edit (by Cheaty McCheatFace):

Another idea that lets you use your code but is somewhat cheaty, is the following. Whenever, you add an element to your AVL-Tree you calculate the k (=n/a) largest element (as done in your code) and store it. This way, the add()-function still has a O(log n) runtime. The get()-function just retrives the stored value and is in O(1).

like image 55
SaiBot Avatar answered Sep 21 '22 08:09

SaiBot