Binary search efficiency vs. linear search efficiency in Fortran

Q: Is binary search more efficient than linear search?

Yes, for sorted data. Binary search needs far fewer comparisons than linear search, and the gap widens as the data set grows.

Q: Why is binary search more efficient than linear search in the worst case?

Each comparison in a binary search discards half of the remaining array, so its worst-case run time is O(log n) (log is base 2 by convention). A linear search may have to examine all n elements in the worst case.

Q: Is binary search more efficient?

Binary search is faster than linear search except for small arrays. However, the array must be sorted first to be able to apply binary search. There are specialized data structures designed for fast searching, such as hash tables, that can be searched more efficiently than binary search.

Q: How much faster is binary search than linear search?

It depends on the array size. One simulation reported binary search running about 1,000 times faster than linear search, but that ratio is not a constant: it grows with the size of the array. If the data is not already sorted, the cost of sorting it first must also be counted.

Tags:

performance

binary-search

fortran

fortran77

linear-search

This question is about the efficiency of a linear search vs. the efficiency of a binary search for a pre-sorted array in contiguous storage...

I have an application written in Fortran (77!). One frequent operation for my part of the code is to find the index in an array such that gx(i) <= xin < gx(i+1). I've currently implemented this as a binary search -- sorry for the statement labels and goto -- I've commented what the equivalent statements would be using Fortran 90...

        i=1
        ih=nx/2
201     continue  !do while (.true.)
           if((xin.le.gx(i)).and.(xin.gt.gx(i+1)))then  !found what we want
              ilow=i+1; ihigh=i
              s1=(gx(ihigh)-xin)/(gx(ihigh)-gx(ilow))
              s2=1.0-s1
              return
           endif
           if(i.ge.ih)then
              goto 202 !exit
           endif
           if(xin.le.gx(ih))then !xin is in second half of array
              i=ih
              ih=nx-(nx-ih)/2
           else !xin is in first half of array
              i=i+1
              ih=i+(ih-i)/2
           endif
        goto 201  !enddo
202     continue  !no bracketing interval found

However, today, I was reading on Wikipedia about binary search and I came across this:

Binary search can interact poorly with the memory hierarchy 
(i.e. caching), because of its random-access nature. For 
in-memory searching, if the span to be searched is small, a
linear search may have superior performance simply because 
it exhibits better locality of reference.

I don't completely understand this statement -- my impression was that cache fetches were gathered in large(ish) chunks at a time, so if we start at the beginning of the array, I thought that most of the array would be in cache already (at least as much as it would be for a linear search), so I didn't think that would matter.

So my question is: is there any way to tell which algorithm will perform better, linear or binary search? Is there an array size boundary? I'm currently using arrays of around 100 elements...


asked May 09 '12 21:05

mgilson

1 Answer

For small arrays, the problem is not cache. You are right: A small array is likely to be cached quickly.

The problem is that branch prediction is likely to fail for binary search because branches are taken or skipped at random in a data-dependent way. Branch prediction misses stall the CPU pipeline.

This effect can be severe. You can easily search 3 to 8 elements linearly in the time it takes to execute a single binary search branch (and a binary search needs several such branches). The exact break-even point has to be measured.
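One way to measure the break-even point is a small timing harness like the sketch below. It is an illustration only, not the asker's code: the names `bracket_search`, `linear_bracket` and `binary_bracket` are hypothetical, it uses free-form modern Fortran rather than Fortran 77, and it assumes gx is sorted in ascending order so that the bracket is gx(i) <= xin < gx(i+1) as stated in the question (the snippet in the question tests the reversed comparisons).

```fortran
module bracket_search
  implicit none
contains

  ! Linear scan. Returns i with gx(i) <= xin < gx(i+1), or 0 if xin is
  ! outside [gx(1), gx(nx)). Assumes gx is ascending (an assumption).
  integer function linear_bracket(gx, nx, xin) result(idx)
    integer, intent(in) :: nx
    real, intent(in) :: gx(nx), xin
    integer :: i
    idx = 0
    if (nx < 2) return
    if (xin < gx(1) .or. xin >= gx(nx)) return
    do i = 1, nx - 1
      if (xin < gx(i+1)) then
        idx = i
        return
      end if
    end do
  end function linear_bracket

  ! Textbook binary search for the same bracket.
  integer function binary_bracket(gx, nx, xin) result(idx)
    integer, intent(in) :: nx
    real, intent(in) :: gx(nx), xin
    integer :: lo, hi, mid
    idx = 0
    if (nx < 2) return
    if (xin < gx(1) .or. xin >= gx(nx)) return
    lo = 1    ! invariant: gx(lo) <= xin
    hi = nx   ! invariant: xin < gx(hi)
    do while (hi - lo > 1)
      mid = lo + (hi - lo) / 2
      if (xin < gx(mid)) then
        hi = mid
      else
        lo = mid
      end if
    end do
    idx = lo
  end function binary_bracket

end module bracket_search

program time_searches
  use bracket_search
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer, parameter :: nx = 100, nq = 1000000
  real :: gx(nx)
  real, allocatable :: xq(:)
  integer :: i, k, sum_lin, sum_bin
  integer(int64) :: c0, c1, c2, rate

  gx = [(real(i), i = 1, nx)]
  allocate(xq(nq))
  call random_number(xq)
  xq = 1.0 + xq * real(nx - 1) * 0.999   ! keep queries inside [gx(1), gx(nx))

  sum_lin = 0
  call system_clock(c0, rate)
  do k = 1, nq
    sum_lin = sum_lin + linear_bracket(gx, nx, xq(k))
  end do
  call system_clock(c1)

  sum_bin = 0
  do k = 1, nq
    sum_bin = sum_bin + binary_bracket(gx, nx, xq(k))
  end do
  call system_clock(c2)

  ! The sums keep the compiler from discarding the loops and double as a check.
  if (sum_lin /= sum_bin) error stop 'searches disagree'
  print *, 'linear (s):', real(c1 - c0) / real(rate)
  print *, 'binary (s):', real(c2 - c1) / real(rate)
end program time_searches
```

Build with optimization (for example `gfortran -O2`), vary nx, and compare the two timings. The numbers depend on the compiler, the CPU and the distribution of the queries, so treat any single run as indicative only.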

Stalling the CPU pipeline is extremely expensive. A Core i7 can retire up to 4 instructions per clock cycle (12 giga-instructions per second at 3 GHz!), but only if it is not stalling.

There are branch-free algorithms that do binary search using conditional-move CPU instructions. These algorithms basically unroll 32 search steps and use a CMOV in each step (32 steps is the theoretical maximum for an array addressed with a 32-bit index). They are branch-free but not stall-free: each step depends entirely on the result of the previous one, so the CPU cannot run ahead in the instruction stream. It has to wait all the time. So they don't solve this problem; they only reduce it slightly.
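A minimal sketch of the branch-free idea in Fortran follows. It is not taken from the answer: the names `branchfree_search` and `branchfree_bracket` are hypothetical, gx is assumed to be ascending, and the loop is left rolled instead of being unrolled to 32 steps. The data-dependent `if` is replaced by the `merge` intrinsic, which a compiler may translate to a conditional move, although nothing in the language guarantees that. The two range checks at the top are still branches, but they are easy to predict when most queries are in range.

```fortran
module branchfree_search
  implicit none
contains

  ! Returns i with gx(i) <= xin < gx(i+1), or 0 if xin is outside
  ! [gx(1), gx(nx)). Assumes gx is ascending (an assumption).
  integer function branchfree_bracket(gx, nx, xin) result(idx)
    integer, intent(in) :: nx
    real, intent(in) :: gx(nx), xin
    integer :: base, n, half
    idx = 0
    if (nx < 2) return
    if (xin < gx(1) .or. xin >= gx(nx)) return
    base = 1   ! invariant: gx(base) <= xin
    n = nx     ! length of the window that starts at base
    ! The trip count depends only on nx, never on the data.
    do while (n > 1)
      half = n / 2
      ! Select the new base without an if/else on the comparison result.
      base = merge(base + half, base, gx(base + half) <= xin)
      n = n - half
    end do
    idx = base
  end function branchfree_bracket

end module branchfree_search

program check_branchfree
  use branchfree_search
  implicit none
  integer, parameter :: nx = 100
  real :: gx(nx), xin
  integer :: i, k

  gx = [(real(i), i = 1, nx)]
  do k = 1, 990
    xin = 1.0 + 0.1 * real(k - 1)   ! sweeps from 1.0 up to about 99.9
    ! Because gx(i) = i here, the expected bracket index is int(xin).
    if (branchfree_bracket(gx, nx, xin) /= int(xin)) error stop 'mismatch'
  end do
  print *, 'all queries agree'
end program check_branchfree
```

Whether this beats the branching version has to be checked by timing it and, ideally, by inspecting the generated assembly, since each `merge` still depends on the result of the previous step.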


answered Sep 21 '22 00:09

usr
