Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Loop through two arrays deleting overlaps in perl

I have two sets of ranges, represented by [ start, stop ] values. Some of the ranges overlap, meaning that the start of one range is in between the [ start, stop ] of the other range. I'd like to make a new set of ranges that has no such overlap, and also doesn't include any new values in a range.

The ranges look like this:

@starts  @ends
      5    108 
      5    187
     44    187
     44    229 
     44    236 
     64    236 
    104    236
    580    644
    632    770

The output that I expect would be this:

@starts  @ends
      5    236
    580    770

This is because the first seven ranges overlap with the interval from 5 => 236, and the last two overlap with the interval from 632 => 770.

Here's the code that I tried:

$fix = 0;
foreach (@ends) {  
    if ($starts[$fix + 1] < $ends[$fix]) {
        splice(@ends, $fix, $fix);
        splice(@starts, $fix + 1, $fix + 1);
    } else {
        $fix += 1;
    }
}

I can print out the values myself, I just need help with the algorithm for merging.

like image 382
Orion Avatar asked Jun 30 '11 16:06

Orion


2 Answers

This edits your arrays in-place, simply collapsing boundaries when they overlap.

# Since they're sorted by @starts, accept the 0th interval, start at 1
for (1..$#starts) {
    # extra check on array bounds, since we edit in-place
    last unless $_ < @starts;
    # don't need to collapse if no overlap with previous end
    next unless $starts[$_] <= $ends[$_-1];
    # delete this start and the previous end
    splice(@starts,$_,1);
    splice(@ends,$_-1,1);
    # rerun this loop for the same value of $_ since it was deleted
    redo;
}
like image 52
Chad A. Davis Avatar answered Sep 24 '22 07:09

Chad A. Davis


I think that this is what you want. You have a series of ranges of the form [start,stop], and you'd like to merge the overlapping ranges. The approach below is fairly simple.

  1. There are two sets of ranges, the original set and the merged set.
  2. You add the first range to the set of merged (non-overlapping) ranges. For each candidate range left from the original set, you make a choice:
    • If that candidate overlaps with one already in the merged set, you extend the boundaries of the range in the merged set appropriately.
    • If there is no overlap between the candidate range and any range in the merged set, you add the candidate to the merged set.

Hopefully this makes sense. It's not too obvious from your question that this is what you wanted, so let me know if this isn't right.

#!/usr/bin/perl

use strict;
use warnings;

my @starts = qw/ 5 5 44 44 44 64 104 580 632 /;
my @ends   = qw/ 108 187 187 229 236 236 236 644 770 /;

my @ranges;
while ( @starts && @ends ) {
    my $s = shift @starts;
    my $e = shift @ends;
    push @ranges, [ $s, $e ];
}

my @merged_ranges;
push @merged_ranges, shift @ranges;

foreach my $range (@ranges) {
    my $overlap = 0;
    foreach my $m_range (@merged_ranges) {
        if ( ranges_overlap($range,$m_range) ) {
            $overlap = 1;
            $m_range = merge_range($range,$m_range);
        }
    }
    if ( !$overlap ) {
        push @merged_ranges, $range;
    }
}

print join ' ', qw/ start end /;
print "\n";
foreach my $range (@merged_ranges) {
    print join ' ', ( $range->[0], $range->[1] );
    print "\n";
}

sub ranges_overlap {
    my $r1 = shift;
    my $r2 = shift;

    return ( $r1->[0] <= $r2->[1] && $r2->[0] <= $r1->[1] );
}

sub merge_range {
    my $r1 = shift;
    my $r2 = shift;
    use List::Util qw/ min max/;

    my $merged = [ min($r1->[0],$r2->[0]), max($r1->[1],$r2->[1]) ];
    return $merged;
}
like image 27
James Thompson Avatar answered Sep 24 '22 07:09

James Thompson