Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can there be an algorithm faster than linear search?

I have heard that there is no faster algorithm faster than linear search (for an unsorted array), but, when I run this algorithm (linear):

public static void search(int[] arr, int value){
    for(int i = 0; i < arr.length; i++){
        if(arr[i] == value) return;
    }
}

With a random array of length 1000000, the average time to find a value is 75ns, but with this algorithm:

public static void skipSearch(int[] arr, int value){
    for(int i = 0; i < arr.length; i+=2){
        if(arr[i] == value) return;
    }
    for(int i = 1; i < arr.length; i+=2){
        if(arr[i] == value) return;
    }
}

I get a shorter average, 68ns?

Edit: A lot of you are saying that I didn't do a proper benchmark and this was by fluke, but I ran these functions 1000000 times and got the average. And every time I ran the functions 1000000 times, I got 75-76ns for the first algorithm, and 67-69ns for the second algorithm.

I used java's System.nanoTime() to measure this.

Code:

int[] arr = new int[1000];
Random r = new Random();
for(int i = 0; i < arr.length; i++){
    arr[i] = r.nextInt();
}
int N = 1000000;
long startTime = System.nanoTime();
for(int i = 0; i < N; i++){
    search(arr, arr[(int) Math.floor(Math.random()*arr.length)]);
}
System.out.println("Average Time: "+(System.nanoTime()-startTime)/(float)N+"ns");
startTime = System.nanoTime();
for(int i = 0; i < N; i++){
    skipSearch(arr, arr[(int) Math.floor(Math.random()*arr.length)]);
}
System.out.println("Average Skip Search Time: "+(System.nanoTime()-startTime)/(float)N+"ns");
like image 964
programmers5 Avatar asked Oct 29 '15 23:10

programmers5


People also ask

Is linear search the fastest?

Because of that, each item must be examined; thus, the best algorithm would require O(N) comparisons, where N is the number of elements. Any computer scientist knows this. For that reason, the fastest algorithm will be a linear search through the list.

What is more efficient than a linear search?

Binary search is more efficient than the linear search in the case of large data sets.

Is there an algorithm faster than binary search?

Interpolation search works better than Binary Search for a Sorted and Uniformly Distributed array. Binary Search goes to the middle element to check irrespective of search-key. On the other hand, Interpolation Search may go to different locations according to search-key.

What is faster linear search or sort?

A binary seach is O(log(m)) and is faster than a linear search of O(n). However one must first sort the data: O(n log(n)), which takes longer. So if the data is filled once, and then often sougth for, take a sorting and binary search.


2 Answers

It's quite possible that, as your search() methods do not return anything, and there isn't any action inside the loops, the JIT compiler in your JVM optimizes the code - in other words, modifies the byte-code before loading it to JVM so that both your search() methods most probably do not do (almost) anything. Which is most significant, it probably also completely removes the loops. JIT optimization is pretty smart, it can identify a lot of situations when it is not needed to load any code into JVM (however the code is in the byte-code .class file).

Then you measure just random numbers - not the real time complexity of your methods.

Read e.g. how to make sure no jvm and compiler optimization occurs, apply it and run your benchmark again.

Also change your search() methods so they return the index - thus making the life for the optimizer harder. However, sometimes it's surprisingly difficult to create a code which is impossible to be optimized :) Turning off the optimization (as in the link above) is more reliable.


Generally it doesn't make sense to benchmark unoptimized code. However, in this case the OP wants to measure a theoretical algorithm. He wants to measure the real number of passes. He has to ensure that the loops are actually performed. That's why he should turn the optimization off.

The OP thought that what he had measured was the speed of the algorithm, while in fact the algorithm had not even had a chance to run at all. Turning the JIT optimization off in this particular case fixes the benchmark.

like image 102
Honza Zidek Avatar answered Sep 27 '22 02:09

Honza Zidek


This is why we are not concerned about literally timing how long things take to execute and more how things grow in scale as the complexity of the inputs increases. Have a look at asymptotic runtime analysis:

https://en.wikipedia.org/wiki/Analysis_of_algorithms

like image 45
Colin Schoen Avatar answered Sep 23 '22 02:09

Colin Schoen