Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How is it possible to do binary search on a doubly-linked list in O(n) time?

I've heard that it's possible to implement binary search over a doubly-linked list in O(n) time. Accessing a random element of a doubly-linked list takes O(n) time, and binary search accesses O(log n) different elements, so shouldn't the runtime be O(n log n) instead?

like image 449
templatetypedef Avatar asked Oct 23 '13 23:10

templatetypedef


People also ask

Can binary search have runtime O logN on sorted linked list?

Binary search is used because it has a time complexity of O(N) for a sorted array. If we follow sequential access of Linked List, then it will take O(N) time to find the middle element. With this, the overall time complexity of Binary Search on Linked List will become O(N * logN).

Can I perform a binary search on a linked list?

Yes, Binary search is possible on the linked list if the list is ordered and you know the count of elements in list. But While sorting the list, you can access a single element at a time through a pointer to that node i.e. either a previous node or next node.

Why is not binary search suitable for a sorted linked list of numbers?

A linked list only allows sequential access, so binary search is impossible even if the list is sorted.

What is the time complexity of doubly linked list?

In a doubly-linked list, the time complexity for inserting and deleting an element is O(1).


1 Answers

It's technically correct to say that the runtime of binary search on a doubly-linked list is O(n log n), but that's not a tight upper bound. Using a slightly better implementation of binary search and a more clever analysis, it's possible to get binary search to run in time O(n).

The basic idea behind binary search is the following:

  • If the list is empty, the element being searched for doesn't exist.
  • Otherwise:
    • Look at the middle element.
    • If it matches the element in question, return it.
    • If it's bigger than the element in question, discard the back half of the list.
    • If it's smaller than the element in question, discard the front half of the list.

A naive implementation of binary search on a doubly-linked list would work by computing the indices to look up on each iteration (just like in the array case), then access each one by starting at the front of the list and scanning forward the appropriate number of steps. This is indeed very slow. If the element being searched for is at the very end of the array, the indices looked up would be n/2, 3n/4, 7n/8, etc. Summing up the work done in the worst case, we get

n / 2 + 3n/4 + 7n/8 + 15n/16 + ... (Θ(log n) terms)

≥ n / 2 + n / 2 + ... + n / 2 (Θ(log n) terms)

= Θ(n log n)

and

n / 2 + 3n/4 + 7n/8 + 15n/16 + ... (Θ(log n) terms)

≤ n + n + ... + n (Θ(log n) terms)

= Θ(n log n)

Therefore, the worst-case time complexity for this algorithm is Θ(n log n).

However, we can speed this up by a factor of Θ(log n) by being more clever with our approach. The reason the previous algorithm is slow is that every time we need to look up an element, we start the search from the beginning of the array. However, we don't need to do this. After looking up the middle element the first time, we're already in the middle of the array, and we know that the next lookup we're going to make will be either at position n / 4 or 3n / 4, which is only distance n / 4 from where we left off (compared to n / 4 or 3n / 4 if we start from the beginning of the array). What if we just trekked out from our stopping position (n / 2) to the next position, rather than restarting at the front of the list?

Here's our new algorithm. Begin by scanning to the middle of the array, which requires n / 2 steps. Then, determine whether to visit the element at the middle of the first half of the array or the middle of the second half of the array. Reaching there from position n / 2 only requires n / 4 total steps. From there, going to the midpoint of the quarter of the array containing the element takes only n / 8 steps, and going from there to the midpoint of the eighth of the array containing the element takes only n / 16 steps, etc. This means that the total number of steps made is given by

n / 2 + n / 4 + n / 8 + n / 16 + ...

= n (1/2 + 1/4 + 1/8 + ...)

≤ n

This follows from the fact that the sum of the infinite geometric series 1/2 + 1/4 + 1/8 + ... is 1. Therefore, the total work done in the worst case only Θ(n), which is much better than the Θ(n log n) worst case from before.

One last detail: why would you ever do this? After all, it already takes O(n) time to search a doubly-linked list for an element. One major advantage of this approach is that even though the runtime is O(n), we only end up doing O(log n) total comparisons (one per step of the binary search). This means that if comparisons are expensive, we might end up doing less work using a binary search than doing a normal linear search, since the O(n) comes from the work done walking the list rather than the work done making comparisons.

like image 130
templatetypedef Avatar answered Oct 25 '22 23:10

templatetypedef