
find kth smallest number in O(logn) time

Tags:

algorithm

Here is the problem: given an unsorted array a[n], I need to find the kth smallest number in the range [i, j], where 1<=i<=j<=n and k<=j-i+1.

Typically I would use quickselect for this, but it is not fast enough when there are many queries with different ranges [i, j]. I have not been able to come up with an algorithm that answers each query in O(logn) time (preprocessing is allowed).

Any idea is appreciated.

PS

Let me make the problem easier to understand. Any kind of preprocessing is allowed, but each query needs to be answered in O(logn) time. There will be many queries, such as finding the 1st in range [3,7], the 3rd in range [10,17], or the 11th in range [33,52].

By range [i, j] I mean positions in the original array, not in a sorted copy of it.

For example, with a[5] = {3,1,7,5,9} (indexes in this example are 0-based), the 1st smallest in range [3,4] is 5, the 2nd in range [1,3] is 5, and the 3rd in range [0,2] is 7.
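To pin down the query semantics, here is a brute-force reference in Python. It is only a baseline sketch (it sorts the slice on every query, so it is nowhere near O(logn)), and the function name kth_smallest_naive is made up for illustration. It assumes 0-based inclusive indexes and a 1-based k, as in the example.

```python
def kth_smallest_naive(a, k, i, j):
    """k-th smallest (1-based k) among a[i..j], both ends inclusive, 0-based."""
    return sorted(a[i:j + 1])[k - 1]


a = [3, 1, 7, 5, 9]
print(kth_smallest_naive(a, 1, 3, 4))  # 5
print(kth_smallest_naive(a, 2, 1, 3))  # 5
print(kth_smallest_naive(a, 3, 0, 2))  # 7
```

Any faster structure should return the same answers as this function for every valid (k, i, j).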

Alcott asked Mar 06 '13 01:03



2 Answers

If pre-processing is allowed and not counted towards the time complexity, just use that to construct sub-lists so that you can efficiently find the element you're looking for. As with most optimisations, this trades space for time.

Your pre-processing step is to take your original list of n numbers and create a number of new sublists.

Each of these sublists is a portion of the original, starting at index n, covering m+1 elements (so list[n][m] holds elements n through n+m) and then sorted. So your original list of:

 {3, 1, 7, 5, 9}

gives you:

 list[0][0] = {3}
 list[0][1] = {1, 3}
 list[0][2] = {1, 3, 7}
 list[0][3] = {1, 3, 5, 7}
 list[0][4] = {1, 3, 5, 7, 9}

 list[1][0] = {1}
 list[1][1] = {1, 7}
 list[1][2] = {1, 5, 7}
 list[1][3] = {1, 5, 7, 9}

 list[2][0] = {7}
 list[2][1] = {5, 7}
 list[2][2] = {5, 7, 9}

 list[3][0] = {5}
 list[3][1] = {5,9}

 list[4][0] = {9}

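This isn't a cheap operation in time or space: for a list of N elements there are N(N+1)/2 sublists holding N(N+1)(N+2)/6 elements in total (35 for the five-element example above). You may therefore want to maintain a "dirty" flag on the list so you only perform it the first time after you do a modifying operation (insert, delete, change).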

In fact, you can use lazy evaluation for even more efficiency. Basically set all sublists to an empty list when you start and whenever you perform a modifying operation. Then, whenever you attempt to access a sublist and it's empty, calculate that sublist (and that one only) before trying to get the kth value out of it.

That ensures sublists are evaluated only when needed and cached to prevent unnecessary recalculation. For example, if you never ask for a value from the 3-through-6 sublist, it's never calculated.

The pseudo-code for creating all the sublists is basically (for loops inclusive at both ends):

for n = 0 to a.lastindex:
    create array list[n]
    for m = 0 to a.lastindex - n:
        create array list[n][m]
        for i = 0 to m:
            list[n][m][i] = a[n+i]
        sort list[n][m]

The code for lazy evaluation is a little more complex (but only a little), so I won't provide pseudo-code for that.
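For readers who want one anyway, a minimal Python sketch of the lazy variant might look like this. The class name LazyRangeKth and its methods are hypothetical, chosen only for illustration. It assumes a dictionary keyed by (i, j) in place of the nested list, and it clears the whole cache on any modification.

```python
class LazyRangeKth:
    """Hypothetical helper: computes each sorted sublist on first use only."""

    def __init__(self, values):
        self.values = list(values)
        self.cache = {}  # (i, j) -> sorted copy of values[i..j]

    def set(self, index, value):
        # Any modifying operation invalidates every cached sublist.
        self.values[index] = value
        self.cache.clear()

    def kth(self, k, i, j):
        # k is 1-based; i and j are inclusive 0-based indexes.
        key = (i, j)
        if key not in self.cache:
            self.cache[key] = sorted(self.values[i:j + 1])
        return self.cache[key][k - 1]


q = LazyRangeKth([3, 1, 7, 5, 9])
print(q.kth(1, 3, 4))  # 5
print(q.kth(2, 1, 3))  # 5
print(q.kth(3, 0, 2))  # 7
```

The first query on a given range pays for the sort; repeated queries on that range are a dictionary lookup plus an index.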

Then, in order to find the kth smallest number in the range i through j (where i and j are the original indexes), you simply look up list[i][j-i][k-1], a very fast O(1) operation:

                +--------------------------+
                |                          |
                |                          v
1st in range [3,4] (values 5,9),   list[3][4-3=1][1-1=0] = 5
2nd in range [1,3] (values 1,7,5), list[1][3-1=2][2-1=1] = 5
3rd in range [0,2] (values 3,1,7), list[0][2-0=2][3-1=2] = 7
|             |                         ^    ^    ^
|             |                         |    |    |
|             +-------------------------+----+    |
|                                                 |
+-------------------------------------------------+

Here's some Python code which shows this in action:

orig = [3, 1, 7, 5, 9]
print(orig)

print("=====")
# sublists[n][m] is the sorted copy of orig[n..n+m]. It is called "list" in
# the text above; renamed here to avoid shadowing the Python builtin.
sublists = []
for n in range(len(orig)):
    sublists.append([])
    for m in range(len(orig) - n):
        sublists[n].append(sorted(orig[n:n + m + 1]))
        print("(%d,%d)=%s" % (n, m, sublists[n][m]))

print("=====")
# Gives xth smallest in index range y through z inclusive.
for x, y, z in [(1, 3, 4), (2, 1, 3), (3, 0, 2)]:
    print("(%d,%d,%d)=%d" % (x, y, z, sublists[y][z - y][x - 1]))
print("=====")

As expected, the output is:

[3, 1, 7, 5, 9]
=====
(0,0)=[3]
(0,1)=[1, 3]
(0,2)=[1, 3, 7]
(0,3)=[1, 3, 5, 7]
(0,4)=[1, 3, 5, 7, 9]
(1,0)=[1]
(1,1)=[1, 7]
(1,2)=[1, 5, 7]
(1,3)=[1, 5, 7, 9]
(2,0)=[7]
(2,1)=[5, 7]
(2,2)=[5, 7, 9]
(3,0)=[5]
(3,1)=[5, 9]
(4,0)=[9]
=====
(1,3,4)=5
(2,1,3)=5
(3,0,2)=7
=====
paxdiablo answered Oct 22 '22 00:10


This solution answers a query in O((logn)^2) time. I am fairly sure it can be modified to run in O(logn). Its main advantage over paxdiablo's algorithm is space efficiency: it needs O(nlogn) space, whereas storing every sorted sublist means O(n^2) sublists, or O(n^3) elements in total.

First, finding the kth smallest element of two sorted arrays of lengths m and n takes O(logm + logn) time. Finding the kth smallest element among sorted arrays of lengths a, b, c, d, ... takes O(loga + logb + ...).
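The two-array case can be sketched in Python as below. This is one common way to do it (a binary search over how many elements to take from the shorter array), not necessarily the variant the answer has in mind, and the function name kth_of_two_sorted is made up for illustration. It assumes 1 <= k <= len(a) + len(b).

```python
def kth_of_two_sorted(a, b, k):
    """Return the k-th smallest (1-based) element of two sorted lists."""
    if len(a) > len(b):
        a, b = b, a  # binary-search over the shorter list
    # We take take_a items from a and k - take_a items from b.
    lo = max(0, k - len(b))
    hi = min(k, len(a))
    while lo < hi:
        take_a = (lo + hi) // 2
        # a[take_a] is smaller than the last item taken from b,
        # so it belongs among the k smallest: take more from a.
        if a[take_a] < b[k - take_a - 1]:
            lo = take_a + 1
        else:
            hi = take_a
    take_a, take_b = lo, k - lo
    last_a = a[take_a - 1] if take_a > 0 else float("-inf")
    last_b = b[take_b - 1] if take_b > 0 else float("-inf")
    return max(last_a, last_b)


print(kth_of_two_sorted([1, 5, 7], [3, 9], 3))  # 5
```

The loop halves the search interval each time, so it runs in O(log min(m, n)) steps, which is within the O(logm + logn) bound stated above.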

Now, sort the whole array and store it. Sort the first half and the second half of the array separately and store those, and so on. You will have 1 sorted array of length n, 2 sorted arrays of length n/2, 4 sorted arrays of length n/4 and so on. Each level stores n elements and there are about logn levels, so the total memory required is 1*n + 2*n/2 + 4*n/4 + 8*n/8 + ... = nlogn.
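The preprocessing step above might be sketched in Python like this. The function name build_levels is hypothetical, and the sketch assumes the array length is a power of two so that every block halves evenly. It also goes all the way down to blocks of length 1, which adds one more level of n elements.

```python
def build_levels(values):
    """levels[d] holds the sorted blocks at depth d (block size n / 2**d)."""
    n = len(values)
    levels = []
    size = n
    while size >= 1:
        # Cut the original array into consecutive blocks and sort each one.
        levels.append([sorted(values[s:s + size]) for s in range(0, n, size)])
        size //= 2
    return levels


levels = build_levels([3, 1, 7, 5, 9, 2, 8, 6])
for blocks in levels:
    print(blocks)
total = sum(len(block) for blocks in levels for block in blocks)
print(total)  # 32, i.e. 8 elements on each of 4 levels
```

Sorting each block from scratch keeps the sketch short; merging the two child blocks instead would build every level in linear time.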

Once you have i and j, figure out the list of subarrays which, when concatenated, give you the range [i,j]. There will be O(logn) such arrays. Finding the kth smallest number among them takes O((logn)^2) time.

Example for the last paragraph: Assume the array is of size 8 (indexed from 0 to 7). You have the following sorted lists:

A:0-7, B:0-3, C:4-7, D:0-1, E:2-3, F:4-5, G:6-7.

Now construct a tree with pointers to these arrays such that every node contains its immediate constituents. A will be root, B and C are its children and so on.

Now implement a recursive function that returns a list of arrays.

def getArrays(node, i, j):
    # The node covers exactly [i, j]: its sorted array is one of the pieces.
    if i == node.min and j == node.max:
        return [node]

    if i <= node.left.max:
        if j <= node.left.max:
            # (i, j) is located within the left node
            return getArrays(node.left, i, j)
        else:
            # (i, j) is spread over the left and right nodes; concatenate the
            # two flat lists instead of nesting them
            return (getArrays(node.left, i, node.left.max)
                    + getArrays(node.right, node.right.min, j))
    else:
        # (i, j) is located within the right node
        return getArrays(node.right, i, j)
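To show the decomposition end to end, here is a self-contained Python sketch. The Node class, its data field, and the names get_arrays and kth_smallest are assumptions made for illustration. Note that the final selection step uses a simpler technique than the one described above: it binary-searches over the sorted values at the root and counts, with bisect, how many elements of the pieces are less than or equal to each candidate. That costs O((logn)^3) per query, not O((logn)^2).

```python
from bisect import bisect_right


class Node:
    def __init__(self, values, lo, hi):
        self.min, self.max = lo, hi            # index range covered (inclusive)
        self.data = sorted(values[lo:hi + 1])  # sorted copy of that range
        self.left = self.right = None
        if lo < hi:
            mid = (lo + hi) // 2
            self.left = Node(values, lo, mid)
            self.right = Node(values, mid + 1, hi)


def get_arrays(node, i, j):
    """Return the nodes whose ranges, concatenated, give exactly [i, j]."""
    if i == node.min and j == node.max:
        return [node]
    if j <= node.left.max:
        return get_arrays(node.left, i, j)
    if i >= node.right.min:
        return get_arrays(node.right, i, j)
    return (get_arrays(node.left, i, node.left.max)
            + get_arrays(node.right, node.right.min, j))


def kth_smallest(root, k, i, j):
    """k-th smallest (1-based) in the original index range [i, j]."""
    pieces = get_arrays(root, i, j)
    candidates = root.data  # every value in the array, sorted
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        # How many elements of the range are <= candidates[mid]?
        count = sum(bisect_right(p.data, candidates[mid]) for p in pieces)
        if count >= k:
            hi = mid
        else:
            lo = mid + 1
    return candidates[lo]


values = [3, 1, 7, 5, 9]
root = Node(values, 0, len(values) - 1)
print(kth_smallest(root, 1, 3, 4))  # 5
print(kth_smallest(root, 2, 1, 3))  # 5
print(kth_smallest(root, 3, 0, 2))  # 7
```

The smallest candidate with at least k range elements at or below it must itself lie in the range, which is why returning candidates[lo] is safe when 1 <= k <= j-i+1.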
ElKamina answered Oct 22 '22 01:10