I have the following code to find a match for a number in a list of ranges.
public class RangeGroup
{
public uint RangeGroupId { get; set; }
public uint Low { get; set; }
public uint High { get; set; }
// More properties related with the range here
}
public class RangeGroupFinder
{
private static readonly List<RangeGroup> RangeGroups=new List<RangeGroup>();
static RangeGroupFinder()
{
// Populating the list items here
RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023238144, High = 1023246335 });
RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023246336, High = 1023279103 });
RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023279104, High = 1023311871 });
RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023311872, High = 1023328255 });
RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023328256, High = 1023344639 });
RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023344640, High = 1023410175 });
RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023410176, High = 1023672319 });
RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023672320, High = 1023688703 });
RangeGroups.Add(new RangeGroup { RangeGroupId = 0, Low = 1023692800, High = 1023696895 });
// There are many more and the groups are not sequential as it can seen on last 2 groups
}
public static RangeGroup Find(uint number)
{
return RangeGroups.FirstOrDefault(rg => number >= rg.Low && number <= rg.High);
}
}
The list of the RangeGroup consists about 5000000 items and the Find() method will be used a lot, so I'm looking for a faster way to make the search. It's no problem to change the structure of the data or split it in any way.
Edit:
All ranges are unique and added by in order of Low and they don't overlap.
Result:
Did a test using ikh's code and the result is approximately 7000 times faster than my code. The test code and results can be seen here.
Since you indicated that RangeGroup
s are added in order of RangeGroup.Low
and that they do not overlap, you don't need to do any further pre-processing. You can do binary search on the RangeGroups
list to find the range (warning: not fully tested, you'd need to check some edge conditions):
public static RangeGroup Find(uint number) {
int position = RangeGroups.Count / 2;
int stepSize = position / 2;
while (true) {
if (stepSize == 0) {
// Couldn't find it.
return null;
}
if (RangeGroups[position].High < number) {
// Search down.
position -= stepSize;
} else if (RangeGroups[position].Low > number) {
// Search up.
position += stepSize;
} else {
// Found it!
return RangeGroups[position];
}
stepSize /= 2;
}
}
The worst-case run time should be around O(log(N)), where N is the number of RangeGroups.
Interval trees. were created exatcly for what you are asking for.
If you populate the list only once you can do a magic trick:
Sort takes O(Nlog(N)) time and is only done once. Binary search takes O(log(N)), which takes at most 17 comparisons for 100.000 items.
May be use a sorted list and do a binary search. That way you reduce the number of comparisons to O(logN)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With