Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient algorithm for finding the largest overlapping range given a list of ranges

Consider the following interface that describes a continuous range of integer values.

public interface IRange {
    int Minimum { get;}
    int Maximum { get;}

    IRange LargestOverlapRange(IEnumerable<IRange> ranges);
} 

I am looking for an efficient algorithm to find the largest overlap range given a list of IRange objects. The idea is briefly outlined in the following diagram. Where the top numbers represent the integer values, and the |-----| represent the IRange objects with a min and max value. I stacked the IRange objects so that the solution is easy to visualize.

0123456789  ...                            N
|-------|   |------------|        |-----|
   |---------|    |---|
       |---|             |------------|
               |--------|  |---------------|
                              |----------|

Here, the LargestOverlapRange method would return:

                                  |---|

Since that range has a total of 4 'overlaps'. If there are two separate IRange with the same number of overlaps, I want to return null.

Here is some brief code of what I tried.

public class Range : IRange 
{

    public IRange LargestOverlapRange(IEnumerable<IRange> ranges) {           

        int maxInt = 20000;    

        // Create a histogram of the counts
        int[] histogram = new int[maxInt];
        foreach(IRange range in ranges) {
            for(int i=range.Minimum; i <= range.Maximum; i++) {
                histogram[i]++;
            }
        }

        // Find the mode of the histogram
        int mode = 0;
        int bin = 0;
        for(int i =0; i < maxInt; i++) {
            if(histogram[i] > mode) {
                mode = histogram[i];
                bin = i;
            }
        }

        // Construct a new range of the mode values, if they are continuous
        Range range;
        for(int i = bin; i < maxInt; i++) {
            if(histogram[i] == mode) {  
                if(range != null)
                    return null; // violates two ranges with the same mode   
                range = new Range();             
                range.Minimum = i;                     
                while(i < maxInt && histrogram[i] == mode)
                    i++;
                range.Maximum = i;                    
            }
        }

        return range;
    }

}

This involves four loops and is easily O(n^2) if not higher. Is there a more efficient algorithm (speed wise) to find the largest overlap range from a list of other ranges?

EDIT

Yes, the O(n^2) is not correct, I was thinking about it incorrectly. It should be O(N * M) as was pointed out in the comments.

EDIT 2

Let me stipulate a few things, the absolute min and max values of the integer values will be from (0, 20000). Secondly, the average number of IRange will be on the order of 100. I don't know if this will change the way the algorithm is designed.

EDIT 3

I am implementing this algorithm on a scientific instrument (a mass spectrometer) in which the speed of the data processing is paramount to the quality of data (faster analysis time = more spectra collected in time T). The firmware language (proprietary) only has arrays[] and is not object orientated. I choose C# since I am decent at porting concepts between the two languages and thought that in the interest of the SO community, a good answer would have a wider audience.

like image 519
Moop Avatar asked Mar 06 '13 17:03

Moop


1 Answers

Convert your list of ranges to a list of start and stop points. Sort the list with an O(n log n) algorithm. Now you can iterate through the list and increment or decrement a counter depending on whether it's a start or stop point, which will give you the current overlap depth.

like image 185
Mark Ransom Avatar answered Oct 07 '22 07:10

Mark Ransom