Suppose there is an array containing unsorted data and I need to choose either linear search or binary search for searching. Then which option should I choose? The time complexity for linear search is O(n) and for binary search is O(log n). But, the fastest sorting algorithm gives the time complexity of O(n * log n). Now, I don't know how to "add" complexities of two algorithms (if that's the right word) and hence, I am asking this question.
So my question is: is sorting and then binary searching better than simply linear searching, or is it the other way around?
Plus, how do I prove it, whichever the case may be, using big O notation (I mean "adding" and "comparing" the time complexities)?
Thank you so much for reading!!! It means a lot.
The worst-case time complexity of the binary search algorithm is O(log n). The best case is O(1), when the middle element of the array is the value you are looking for.
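To make the best and worst cases concrete, here is a minimal sketch of binary search that also counts how many elements it inspects. The function name binary_search and the probe counter are just illustrative, and the input list is assumed to be sorted already.

```python
def binary_search(items, target):
    """Return (index, probes); index is -1 if target is absent.

    Assumes items is sorted in ascending order.
    """
    lo, hi = 0, len(items) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1  # one element inspected per loop iteration
        if items[mid] == target:
            return mid, probes
        if items[mid] < target:
            lo = mid + 1  # target can only be in the right half
        else:
            hi = mid - 1  # target can only be in the left half
    return -1, probes


data = list(range(0, 200, 2))  # 100 sorted even numbers

# Best case: the target sits exactly at the first middle position.
print(binary_search(data, data[49]))

# A miss still needs only about log2(100) probes.
print(binary_search(data, 1))
```

With 100 elements, a hit on the first middle element takes a single probe, while a miss takes at most 7.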
Binary search itself is O(log n), but it does not work on unsorted lists. For an unsorted list, just do a straight search starting from the first element; this is O(n). If you first sort the array with merge sort or any other O(n log n) algorithm, the total cost of sorting plus one search is O(n log n).
You only have to do the sort once, and then binary search the resulting data set multiple times. The point of the search is not to generate a sorted array. It is to locate a specific value. The search requires a sorted array.
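As a rough illustration of "sort once, search many times", here is a small sketch using Python's standard bisect module. The helper name contains and the sample data are made up for the example.

```python
from bisect import bisect_left


def contains(sorted_items, target):
    """Binary search membership test on an already sorted list."""
    i = bisect_left(sorted_items, target)
    return i < len(sorted_items) and sorted_items[i] == target


data = [42, 7, 19, 3, 88, 61]

# The O(n log n) sort is paid a single time...
sorted_data = sorted(data)

# ...and every later lookup costs only O(log n).
queries = [19, 5, 88]
results = [contains(sorted_data, q) for q in queries]
print(results)
```

Note that sorted() returns a new list, so the original order of data is preserved if you still need it.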
You don't really "add" the complexities. Sorting is, as you say, O(n * log n), and searching is O(log n). If you were to do "normal math" on them, then it would be (n+1)*log n, which is still n*log n.
When you're performing multiple steps one after another like that, you typically keep only the dominant (fastest-growing) term and call the whole thing that. After all, when n is sufficiently large, n*log n dwarfs log n.
Think of it this way: when n is 1,000,000, log n (base 2) is about 20, so n*log n is about 20 million. So what's the difference between 20,000,000 and 20,000,020? The (log n) term is irrelevant. So (n log n) + (log n) is, for all intents and purposes, equal to (n log n). Even when n is 100, log n is only about 7. The (log n) term just won't make a difference when n is even moderately large.
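If you want to check the numbers above yourself, a quick sketch like this works. It treats n*log2(n) and log2(n) as stand-ins for the real operation counts, which is an assumption that ignores constant factors; the function name is only illustrative.

```python
import math


def sort_and_search_costs(n):
    """Idealized step counts: n*log2(n) for the sort, log2(n) for one search."""
    log_n = math.log2(n)
    return n * log_n, log_n


for n in (100, 1_000_000):
    sort_cost, search_cost = sort_and_search_costs(n)
    # Fraction of the combined work that the single binary search accounts for.
    share = search_cost / (sort_cost + search_cost)
    print(f"n={n}: sort ~{sort_cost:,.0f}, search ~{search_cost:.1f}, "
          f"search share {share:.6%}")
```

The search's share of the total is exactly 1/(n+1) in this model, which is why it vanishes as n grows.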
In your particular case, if you only need to search the list one time, then sequential search is the way to go. If you need to search it multiple times, then you have to weigh the cost of m sequential searches, O(m * n), against the cost of sorting once and then doing m binary searches, O(n * log n + m * log n). If you're interested in the minimum time and you know how many times you'll be searching the list, then you'd use sequential search if (m * n) is less than ((n + m) * log n), which works out to roughly m less than log n. Otherwise, sort and then use binary search.
But that's not the only consideration. Binary search on a sorted list gives you very quick response time, whereas linear search can take a very long time for a single item. If you can afford to sort the list during program startup then that's probably the best way to go because items will be found (or not found) much faster once the program is operating. Sorting the list gives you better response time. It's better to pay the price of sorting during startup than to experience very unpredictable response times during operation. Or to find out that you need to do more searches than you thought. . .
If you have to do one search, do linear search. It's obviously better than sorting and then binary search.
But if you have multiple search queries, in most cases you should first sort the array and then apply a binary search to every query.
Why? Let's say you're going to perform k search queries. If you do a linear search each time, you'll end up with O(n*k) operations. If you sort first, that will take O(n log n) + O(k log n) = O((n+k) log n) operations. Which is better? When k is very small (less than about log n), it's better to do linear search. However, in most cases you're better off sorting first.
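The comparison can be sketched as a tiny cost model. This is only a back-of-the-envelope estimate: it assumes each linear search costs n steps, the sort costs n*log2(n), each binary search costs log2(n), and it ignores constant factors and memory effects. All function names are made up for the example.

```python
import math


def linear_cost(n, k):
    """k linear searches over n unsorted items."""
    return n * k


def sort_then_binary_cost(n, k):
    """One sort plus k binary searches: (n + k) * log2(n)."""
    return (n + k) * math.log2(n)


def cheaper_strategy(n, k):
    """Pick whichever idealized cost is smaller."""
    if linear_cost(n, k) <= sort_then_binary_cost(n, k):
        return "linear"
    return "sort+binary"


n = 1_000_000  # log2(n) is about 20
for k in (1, 19, 20, 100):
    print(k, cheaper_strategy(n, k))
```

In this model the switch happens right around k = log2(n): for a million items, linear search wins up to 19 queries and sorting first wins from 20 queries on.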