Clarification of statement of performance of collection's binary search from javadoc

Question

I am confused on the performance analysis of binarySearch from the Collections

It says:

If the specified list does not implement the RandomAccess interface and is large, this method will do an iterator-based binary search that performs O(n) link traversals and O(log n) element comparisons.

I am not sure how to interpret this O(n) + O(log n).

I mean isn't it worse than simply traversing the linked-list and compare? We still get only O(n).

So what does this statement mean about performance? As phrased, I can not understand the difference from a plain linear search in the linked list.

What am I missunderstanding here?

Tomasz Nurkiewicz · Accepted Answer

First of all you must understand that without RandomAccess interface the binarySearch cannot simply access, well, random element from the list, but instead it has to use an iterator. That introduces O(n) cost. When the collection implements RandomAccess, cost of each element access is O(1) and can be ignored as far as asymptotic complexity is concerned.

Because O(n) is greater than O(log n) it will always take precedence over O(log n) and dominate the complexity. In this case binarySearch has the same complexity as simple linear search. So what is the advantage?

Linear search performs O(n) comparisons, as opposed to O(log n) with binarySearch without random access. This is especially important when the constant before O(logn) is high. In plain English: when single comparison has a very high cost compared to advancing iterator. This might be quite common scenario, so limiting the number of comparisons is beneficial. Profit!

Groo · Answer

Binary search is not suited for linked lists. The algorithm is supposed to benefit from a sorted collection with random access (like a plain array), where it can quickly jump from one element to another, splitting the remaining search space in two on each iteration (hence the O(log N) time complexity).

For a linked list, there is a modified version which iterates through all elements (and needs to go through 2n elements in the worst case), but instead of comparing every element, it "probes" the list at specified positions only (hence doing a lower number of comparisons compared to a linear search).

Since comparisons are usually slightly more costly compared to plain pointer iterating, total time should be lower. That is why the log N part is emphasized separately.

Clarification of statement of performance of collection's binary search from javadoc

Tags:

java

algorithm

complexity-theory

list

collections

Cratylus

2 Answers

Tomasz Nurkiewicz

Groo

Recent Activity

Donate For Us

Clarification of statement of performance of collection's binary search from javadoc

Tags:

java

algorithm

complexity-theory

list

collections

Cratylus

2 Answers

Tomasz Nurkiewicz

Groo

Related questions

Recent Activity

Donate For Us