Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fastest way to search a number in a list of ranges

I have the following code to find a match for a number in a list of ranges.

public class RangeGroup
{
    public uint RangeGroupId { get; set; }
    public uint Low { get; set; }
    public uint High { get; set; }
    // More properties related with the range here
}

public class RangeGroupFinder
{
    private static readonly List<RangeGroup> RangeGroups=new List<RangeGroup>();

    static RangeGroupFinder()
    {
        // Populating the list items here
        RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023238144, High = 1023246335 });
        RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023246336, High = 1023279103 });
        RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023279104, High = 1023311871 });
        RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023311872, High = 1023328255 });
        RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023328256, High = 1023344639 });
        RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023344640, High = 1023410175 });
        RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023410176, High = 1023672319 });
        RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023672320, High = 1023688703 });
        RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023692800, High = 1023696895 });
       // There are many more and the groups are not sequential as it can seen on last 2 groups
    }

    public static RangeGroup Find(uint number)
    {
        return RangeGroups.FirstOrDefault(rg => number >= rg.Low && number <= rg.High);
    }
}

The list of the RangeGroup consists about 5000000 items and the Find() method will be used a lot, so I'm looking for a faster way to make the search. It's no problem to change the structure of the data or split it in any way.

Edit:

All ranges are unique and added by in order of Low and they don't overlap.

Result:

Did a test using ikh's code and the result is approximately 7000 times faster than my code. The test code and results can be seen here.

like image 938
M. Mennan Kara Avatar asked Aug 08 '12 16:08

M. Mennan Kara


4 Answers

Since you indicated that RangeGroups are added in order of RangeGroup.Low and that they do not overlap, you don't need to do any further pre-processing. You can do binary search on the RangeGroups list to find the range (warning: not fully tested, you'd need to check some edge conditions):

public static RangeGroup Find(uint number) {
    int position = RangeGroups.Count / 2;
    int stepSize = position / 2;

    while (true) {
        if (stepSize == 0) {
            // Couldn't find it.
            return null;
        }

        if (RangeGroups[position].High < number) {
            // Search down.
            position -= stepSize;

        } else if (RangeGroups[position].Low > number) {
            // Search up.
            position += stepSize;

        } else {
            // Found it!
            return RangeGroups[position];
        }

        stepSize /= 2;
    }
}

The worst-case run time should be around O(log(N)), where N is the number of RangeGroups.

like image 154
ikh Avatar answered Sep 19 '22 20:09

ikh


Interval trees. were created exatcly for what you are asking for.

like image 21
alinsoar Avatar answered Sep 19 '22 20:09

alinsoar


If you populate the list only once you can do a magic trick:

  • Sort the List
  • Use BinarySearch

Sort takes O(Nlog(N)) time and is only done once. Binary search takes O(log(N)), which takes at most 17 comparisons for 100.000 items.

like image 40
oleksii Avatar answered Sep 17 '22 20:09

oleksii


May be use a sorted list and do a binary search. That way you reduce the number of comparisons to O(logN)

like image 32
Jeff Avatar answered Sep 20 '22 20:09

Jeff