Is binary search optimal in the worst case? My instructor has said so, but I could not find a book that backs it up. We start with an ordered array, and in the worst case (the worst case for that algorithm), any other algorithm will take at least as many pairwise comparisons as binary search.
Many people said that the question was unclear. Sorry! So the input is any general sorted array, and I am looking for a proof that any search algorithm takes at least log2(N) comparisons in its worst case (the worst case for the algorithm in consideration).
For a binary search, the best case occurs when the target is the middle element of the search list, so it is found on the very first comparison. The worst case occurs when the target item is not in the search list at all (or is only located by the final comparison).
Binary search is more efficient than linear search on large data sets.
The biggest limitation of binary search is that it can only be used when the data is already sorted.
So, is binary search applicable to unsorted arrays? No: binary search cannot be implemented on unsorted arrays or lists, because repeatedly discarding half of the range around the middle element relies on the data being in sorted order.
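For concreteness, here is a minimal iterative binary search sketch in Python (a standard textbook version, not taken from the original post); it assumes the input list is already sorted in ascending order.

```python
def binary_search(sorted_list, target):
    """Return the index of target in sorted_list, or -1 if it is absent.

    Assumes sorted_list is sorted in ascending order.
    """
    lo, hi = 0, len(sorted_list) - 1
    while lo <= hi:
        mid = (lo + hi) // 2              # middle of the remaining range
        if sorted_list[mid] == target:
            return mid                    # found it
        elif sorted_list[mid] < target:
            lo = mid + 1                  # target can only be in the right half
        else:
            hi = mid - 1                  # target can only be in the left half
    return -1                             # target is not in the list


# Hypothetical example: the search halves the range until it hits 13.
print(binary_search([2, 3, 5, 7, 11, 13, 17, 19], 13))  # -> 5
```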
Yes, binary search is optimal.
This is easily seen by appealing to information theory. It takes log2(N) bits merely to identify a unique element out of N elements. But each comparison only gives you one bit of information. Therefore, you must perform log2(N) comparisons to identify a unique element.
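To spell out the counting (an elaboration of the answer's argument, not part of the original): a sequence of k yes/no comparison outcomes can distinguish at most 2^k different answers, so telling apart N possible results requires 2^k >= N, i.e. k >= log2(N). A quick sanity check in Python:

```python
import math

# With k binary answers you can distinguish at most 2**k elements,
# so identifying one of N elements needs at least ceil(log2(N)) comparisons.
for n in (8, 1000, 1_000_000):
    print(n, math.ceil(math.log2(n)))  # -> 3, 10, 20
```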
More verbosely... Consider a hypothetical algorithm X that outperforms binary search in the worst case. For a particular element of the array, run the algorithm and record the questions it asks; i.e., the sequence of comparisons it performs. Or rather, record the answers to those questions (like "true, false, false, true").
Convert that sequence into a binary string (1,0,0,1). Call this binary string the "signature of the element with respect to algorithm X". Do this for each element of the array, assigning a "signature" to each element.
Now here is the key. If two elements have the same signature, then algorithm X cannot tell them apart! All the algorithm knows about the array are the answers it gets from the questions it asks; i.e., the comparisons it performs. And if the algorithm cannot tell two elements apart, then it cannot be correct. (Put another way, if two elements have the same signature, meaning they result in the same sequence of comparisons by the algorithm, which one did the algorithm return? Contradiction.)
Finally, prove that if every signature has fewer than log2(N) bits, then there must exist two elements with the same signature (pigeonhole principle). Done.
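To make the signature idea concrete (my own sketch, not from the original answer), the snippet below instruments an ordinary binary search so that it records the outcome of every comparison, then checks that no two elements of a sorted array end up with the same signature.

```python
def signature(sorted_list, target):
    """Record the comparison outcomes binary search produces while locating target.

    The returned tuple is the "signature" of target with respect to this
    particular search algorithm.
    """
    outcomes = []
    lo, hi = 0, len(sorted_list) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_list[mid] == target:
            outcomes.append("=")
            return tuple(outcomes)
        elif sorted_list[mid] < target:
            outcomes.append("<")
            lo = mid + 1
        else:
            outcomes.append(">")
            hi = mid - 1
    return tuple(outcomes)


data = [2, 3, 5, 7, 11, 13, 17, 19]
sigs = [signature(data, x) for x in data]

# Distinct elements must get distinct signatures, otherwise the search
# could not tell them apart.
assert len(set(sigs)) == len(data)
print(max(len(s) for s in sigs))  # -> 4 (longest signature for these 8 elements)
```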
[update]
One quick additional comment. The above assumes that the algorithm does not know anything about the array except what it learns from performing comparisons. Of course, in real life, sometimes you do know something about the array a priori. As a toy example, if I know that the array has (say) 10 elements all between 1 and 100, and that they are distinct, and that the numbers 92 through 100 are all present in the array... Then clearly I do not need to perform four comparisons even in the worst case.
More realistically, if I know that the elements are uniformly distributed (or roughly uniformly distributed) between their min and their max, again I can do better than binary search.
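One standard way to exploit such knowledge is interpolation search; the sketch below (mine, not part of the original answer) probes where the target would be expected to lie if the values were spread uniformly, instead of always probing the middle.

```python
def interpolation_search(sorted_list, target):
    """Return the index of target in sorted_list, or -1 if it is absent.

    Assumes ascending order. On (roughly) uniformly distributed values it
    needs about O(log log N) probes on average, but it can degrade to O(N)
    probes on skewed inputs.
    """
    lo, hi = 0, len(sorted_list) - 1
    while lo <= hi and sorted_list[lo] <= target <= sorted_list[hi]:
        if sorted_list[hi] == sorted_list[lo]:
            # All remaining values are equal; one check settles it.
            return lo if sorted_list[lo] == target else -1
        # Probe where the target "should" be under a uniform assumption.
        pos = lo + (target - sorted_list[lo]) * (hi - lo) // (sorted_list[hi] - sorted_list[lo])
        if sorted_list[pos] == target:
            return pos
        elif sorted_list[pos] < target:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1


print(interpolation_search(list(range(0, 1000, 10)), 430))  # -> 43
```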
But in the general case, binary search is still optimal.