Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

binary search efficiency vs. linear search efficiency in fortran

This question is about the efficiency of a linear search vs. the efficiency of a binary search for a pre-sorted array in contiguous storage...

I have an application written in fortran (77!). One frequent operation for my part of the code is to find the index in an array such that gx(i) <= xin < gx(i+1). I've currently implemented this as a binary search -- sorry for the statement labels and goto -- I've commented what the equivalent statments would be using fortran 90...

        i=1
        ih=nx/2
201     continue  !do while (.true.)
           if((xin.le.gx(i)).and.(xin.gt.gx(i+1)))then  !found what we want
              ilow=i+1; ihigh=i
              s1=(gx(ihigh)-xin)/(gx(ihigh)-gx(ilow))
              s2=1.0-s1
              return
           endif
           if(i.ge.ih)then
              goto 202 !exit
           endif
           if(xin.le.(gx(ih))then !xin is in second half of array
              i=ih
              ih=nx-(nx-ih)/2
           else !xin is in first half of array
              i=i+1
              ih=i+(ih-i)/2
           endif
        goto 201  !enddo

However, today, I was reading on Wikipedia about binary search and I came across this:

Binary search can interact poorly with the memory hierarchy 
(i.e. caching), because of its random-access nature. For 
in-memory searching, if the span to be searched is small, a
linear search may have superior performance simply because 
it exhibits better locality of reference.

I don't completely understand this statement -- my impression was that cache fetches were gathered in large(ish) chunks at a time, so if we start at the beginning of the array, I thought that most of the array would be in cache already (at least as much as it would be for a linear search), so I didn't think that would matter.

So my question is, is there any way to tell which algorithm will perform better (linear or binary search?) Is there an array size boundary? I'm currently using arrays of size around 100 elements...

like image 567
mgilson Avatar asked May 09 '12 21:05

mgilson


People also ask

Is binary search more efficient than linear search?

Efficiency: Binary search is faster (in terms of scan cycles) and more efficient compared to linear search especially for larger data sets.

Why binary search is more efficient than linear search in worst cases?

As we can see, binary search is much more efficient than linear search, since every time we only need to search through half of the remaining array. In fact, binary search only has a worst case run-time complexity of O(log n) (log by convention is base 2).

Is binary search more efficient?

Binary search is faster than linear search except for small arrays. However, the array must be sorted first to be able to apply binary search. There are specialized data structures designed for fast searching, such as hash tables, that can be searched more efficiently than binary search.

How much faster is binary search than linear search?

We must consider choosing the best sorting algorithm. According to the simulation, it concludes that the Binary search algorithm is 1,000 times faster than the Linear search algorithm.


1 Answers

For small arrays, the problem is not cache. You are right: A small array is likely to be cached quickly.

The problem is that branch prediction is likely to fail for binary search because branches are taken or skipped at random in a data-dependent way. Branch prediction misses stall the CPU pipeline.

This effect can be severe. You can easily search 3 to 8 elements linearly in the same time it takes to do a single binary search branch (and you need to do multiple binary search branches). The exact break even point needs to be measured.

Stalling the CPU pipeline is extremely expensive. A Core i7 can retire up to 4 instructions per clock cycle (12 giga-instructions per second at 3 GHz!). But only, if you are not stalling.

There are branch-free algorithms doing binary search by using conditional-move CPU instructions. These algorithms basically unroll 32 search steps and use a CMOV in each step (32 steps are the theoretical maximum). They are branch-free but not stall free: Each next step depends 100% on the previous one so the CPU cannot charge ahead in the instruction stream. It has to wait all the time. So they don't solve this problem, only improve it slightly.

like image 110
usr Avatar answered Sep 21 '22 00:09

usr