Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to prove a lower bound logn for a data structure?

I have a homework question as follows (note that I am not looking for exact answers, just looking for simple suggestions to move on).

S is a data structure which supports Insert(x,S), Delete(x, S) and Find_Smallest_Item(S) in time <= T(n). Prove a lower bound for T(n) such as Ω(logn).

Here is my thoughts so far:

I guess I need to find a reduction where I will reduce this problem into an easier one, and prove it cannot be lower than logn. I read many tutorials about lower bounds and most of them reduced a problem into sorting, then they used sorting as a black box and proved the algorithm cannot be lower than nlogn.

But here, we are dealing with logn and I don't know such an algorithm to reduce to. Maybe it's gotta do something with the depth of a tree structure, logn. But I couldn't figure out where to start.

Can you suggest me some hints?

Edit: Actually something came to my mind but I don't know if I am supposed to prove a lower bound with a trick like that. So, I assume I have insert, delete and find_smallest operations, each of them has a logn time complexity.

For example, to construct a sorted list, I can use delete and find_smallest functions, e.g. I can run find_smallest for the first time, and after finding the smallest element in the list, then I will delete the element. I will run it once again and therefore I will find 2nd smallest element, and so on.

Therefore, I can implement sorting by using deletion and find_smallest functions. So if I continue doing this n times, each of them will take logn (for deletion) + logn (for finding smallest), so in overall, sorting will take nlogn.

I don't know how to tweak this for insertion, though.

Edit 2: In order to use insertion for the proof: after finding the ith smallest element in the list, what if I insert it to ith place? e.g. after finding 3rd smallest element with the procedure above, I can insert it to 3rd index of the data structure. Therefore, at the end I will obtain a sorted data structure.

like image 397
Bin Avatar asked Oct 04 '11 01:10

Bin


People also ask

How do you prove a lower bound?

A lower bound for a problem is the worst-case running time of the best possible algorithm for that problem. To prove a lower bound of Ω(n lg n) for sorting, we would have to prove that no algorithm, however smart, could possibly be faster, in the worst-case, then n lg n.

Which notation shows the lower bound for the algorithm?

The lower bound for an algorithm (or a problem, as explained later) is denoted by the symbol Ω, pronounced “big-Omega” or just “Omega”.

How can the bounds be determined for any sorting algorithm?

Alternatively, any sorting algorithm must at least look at every input value to recognize whether the input values are in sorted order. So, based on our current knowledge of sorting algorithms and the size of the input, we know that the problem of sorting is bounded by Ω(n) and O(nlogn).

How do you find the upper bound and lower bound of a function?

If you divide a polynomial function f(x) by (x - c), where c > 0, using synthetic division and this yields all positive numbers, then c is an upper bound to the real roots of the equation f(x) = 0. Note that two things must occur for c to be an upper bound. One is c > 0 or positive.


2 Answers

Reducing your problem to another one will give you an upper bound on your O(), not a lower one.

On the other hand, if you can use any solution to your problem to implement some other algorithm with a well-known lower bound (effectively reducing that problem to yours), that may give you the lower bound you're looking for.


Answer:

As others have suggested, you can use the datastructure S to implement sorting:

for i in range(input):
    Insert(S, input[i])
for i in range(input):
    x = Find_Smallest_Item(S)
    output[i] = x
    Delete(S, x)

For input of size N this algorithm makes N calls to each of your three operations. However, we know that any general purpose sorting algorithm must have a worst case of O(N log N).

This means there will be cases for which the average time of the datastructure calls in the above sort is O(log N) per call. Since that is incompatible with any T() asymptotically better than log N, you have your lower bound.

Some notes:

  • The kind of datastructure described in your problem is called a priority queue.

  • Since you're trying to prove a lower bound on any possible priority queue, you can't make an assumption about the implementation. Just because a particular datastructure gives you a particular O() performance, doesn't mean some completely different datastructure couldn't be better.

  • There are many priority queue datastructures that satisfy O(log N) for all calls, so this is actually a tight lower bound.

like image 132
comingstorm Avatar answered Oct 27 '22 10:10

comingstorm


Your first edit isn't correct.

I can run find_smallest for the first time, and after finding the smallest element in the list,

Running Find_Smallest_Item(S) will find the smallest element in S, not the smallest element in the list. You first need to Insert(elements from the list,S) before there is anything to find!

Maybe you should try to write down some code, like this:

List l = [1,25,4,3,7]
S S         // empty
List sorted // empty
//now we can do:
insert(l[0],S)
//or
delete(25,S)
//magic goes here
...
//and l is sorted!

The trick is to write down code that sorts l (or makes another sorted list). To prove the lower bound, count the steps(inserts,deletes and findmins) and use the fact that no matter what code you write it can't be (worst case) faster than O(nlogn) (for comparison sorts). This gives a lower bound, it has to be at least this slow.

like image 44
Ishtar Avatar answered Oct 27 '22 10:10

Ishtar