Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient algo to find number of integers in a sorted array that are within a certain range in O(log(N)) time?

I came across a interview question that has to be done in O(logn)

Given a sorted integer array and a number, find the start and end indexes of the number in the array.

Ex1: Array = {0,0,2,3,3,3,3,4,7,7,9} and Number = 3 --> Output = {3,6} 

Ex2: Array = {0,0,2,3,3,3,3,4,7,7,9} and Number = 5 --> Output = {-1,-1} 

I am trying to find an efficient algo for this but so fat have not been successful.

like image 738
Bhaskar Avatar asked Nov 07 '13 18:11

Bhaskar


3 Answers

You can use the concept of binary search to find the starting and ending index:

  • To find the starting index, you halve the array, if the value is equal to or greater than the input number, repeat with the lower half of the array, otherwise repeat with the higher half. stop when you reached an array of size 1.
  • To find the starting index, you halve the array, if the value is greater than the input number, repeat with the lower half of the array, otherwise repeat with the higher half. stop when you reached an array of size 1.

Note that when we reached an array of size 1, we may be one cell next to the input number, so we check if it equals the input number, if not, we fix the index by adding/decreasing 1 from the index we found.

findStartIndex(int[] A, int num)
{
    int start = 0; end = A.length-1;

    while (end != start) 
    {
        mid = (end - start)/2;

        if (A[mid] >= num)
            end = mid;
        else
            start = mid;
    }

    if(A[start] == num) 
        return start;
    else 
        return start+1;
}

findEndIndex(int[] A, int num)
{
    int start = 0; end = A.length-1;

    while (end != start) 
    {
        mid = (end - start)/2;

        if (A[mid] > num)
            end = mid;
        else
            start = mid;
    }

    if(A[start] == num) 
        return start;
    else 
        return start-1;
}

And the whole procedure:

int start = findStartIndex(A, num);

if (A[start]!=num) 
{ 
     print("-1,-1"); 
}
else
{
     int end = findEndIndex(A, num);
     print(start, end);
}
like image 160
Ron Teller Avatar answered Nov 07 '22 11:11

Ron Teller


Sounds like a binary search -- log graphs iirc represent the effect of "halving" with each increment, which basically is binary search.

Pseudocode:

Set number to search for
Get length of array, check if number is at the half point
if the half is > the #, check the half of the bottom half. is <, do the inverse
repeat
    if the half point is the #, mark the first time this happens as a variable storing its index
    then repeat binary searches above , and then binary searches below (separately), such that you check for how far to the left/right it can repeat.
    note*: and you sort binary left/right instead of just incrementally, in case your code is tested in a dataset with like 1,000,000 3's in a row or something

Is this clear enough to go from there?

like image 25
HC_ Avatar answered Nov 07 '22 09:11

HC_


The solution is to binary search the array concurrently (does't actually have to be concurrent :P ) at the start. The key is that the left and right searches are slightly different. For the right side if you encounter a dupe you have to search to the right, and for the left side if you encounter a dupe you search to the left. what you are searching for is the boundary so on the right side you check for.

 yournum, not_yournum  

This is the boundary and on the left side you just search for the boundary in the opposite direction. At the end return the indices of the boundaries.

like image 40
aaronman Avatar answered Nov 07 '22 11:11

aaronman