
Sorting arrays of intervals in perl?

Tags:

perl

I have an array in perl with some intervals like:

my @array = qw(1-5 7-9 10-15 20-58 123-192 234-256);

I am trying to order it using sort but this is what I obtain:

1-5 , 10-15 , 123-192 , 20-58 , 234-256 , 7-9

It is sorted by the first character of the first number... How can it be ordered by the whole first number, in order to obtain the following array?

1-5 , 7-9 , 10-15 , 20-58 , 123-192 , 234-256

Thank you very much!

P.S.

I have no code for this; I am just trying the command:

my @sorted = sort @array;
Shikari asked Dec 19 '22 09:12


2 Answers

What you want to do is sort numerically. To do that, you need to override the default sort method by supplying your own method. This code:

my @sorted = sort @array;

really means this:

my @sorted = sort { $a cmp $b } @array;

Here cmp is the lexicographical comparison operator (it sorts in alphabetic order, more or less). You want to use <=> instead, the numeric comparison operator, often called "the spaceship operator".

my @sorted = sort { $a <=> $b } @array;

But this operator is meant for numbers, and a string such as 7-9 is not really a number. In this case it happens to work, because Perl uses the leading digits of the string as its numeric value, but with warnings enabled it issues: Argument "7-9" isn't numeric in sort.
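As a minimal sketch of the difference between the two operators, the script below sorts the array from the question with cmp and with <=>, and collects any warnings instead of printing them. The variable names @by_string, @by_number and @warnings are made up for this illustration.

```perl
use strict;
use warnings;

my @array = qw(1-5 7-9 10-15 20-58 123-192 234-256);

# Collect warnings in an array instead of printing them to STDERR.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, @_ };

# String comparison: "10-15" sorts before "7-9" because "1" lt "7".
my @by_string = sort { $a cmp $b } @array;

# Numeric comparison: each string is treated as its leading number,
# which triggers an "isn't numeric" warning under "use warnings".
my @by_number = sort { $a <=> $b } @array;

print "cmp: @by_string\n";
print "<=>: @by_number\n";
print "warnings collected: ", scalar(@warnings), "\n";
```

The numeric sort gives the wanted order here only because every string starts with digits; the collected warnings are the reason the regex-based version is preferable.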

To avoid this warning, and a possible bug, we need to extract the numbers from the strings we want to sort. We do this with a regex match: /\d+/g. In list context, this matches and returns every run of consecutive digits in the string.

my @sorted = sort {
                     my ($a1, $a2) = $a =~ /\d+/g;
                     my ($b1, $b2) = $b =~ /\d+/g;
                     $a1 <=> $b1 || $a2 <=> $b2;
} @array;

We capture both the low and the high end of each range, and at the end we chain both comparisons. This means that when $a1 and $b1 are equal, <=> returns 0, and the || operator falls through to the alternate comparison, $a2 <=> $b2.
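To see the tie-break at work, here is a small sketch using a hypothetical input (not the array from the question) in which three intervals share the same low end, so the second comparison has to decide their order:

```perl
use strict;
use warnings;

# Hypothetical input: three intervals share the low end 10.
my @ranges = qw(10-30 7-9 10-15 1-5 10-20);

my @sorted = sort {
    my ($a1, $a2) = $a =~ /\d+/g;   # low and high end of $a
    my ($b1, $b2) = $b =~ /\d+/g;   # low and high end of $b
    $a1 <=> $b1 || $a2 <=> $b2;     # high end decides only on a tie
} @ranges;

print "@sorted\n";
```

With only $a1 <=> $b1, the relative order of 10-30, 10-15 and 10-20 would be left to the sort implementation.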

With a large data set, running the regex match inside every comparison is expensive and can make the sort very slow. In that case, we can cache the data using what is known as a Schwartzian transform. In this method, we run the regex match once per element, store the result, and use the stored values when sorting. To do this, we use an anonymous array ref [ ... ]. At the end we restore the original value and discard the cache:

my @sorted = map  { $_->[0] }                # restore original
             sort { $a->[1] <=> $b->[1] }    # sort compares stored nums
             map  { [ $_, /\d+/g ] }         # store original, and nums
             @array;

If you want more than one sort level, you just chain further comparisons with ||, such as || $a->[2] <=> $b->[2], and so on.
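A two-level version of the transform might look like the sketch below. The input is again a hypothetical list with repeated low ends, chosen so that the second level actually matters:

```perl
use strict;
use warnings;

# Hypothetical input with repeated low ends.
my @ranges = qw(10-30 7-9 10-15 1-5 10-20);

my @sorted = map  { $_->[0] }               # restore original string
             sort { $a->[1] <=> $b->[1]     # first level: low end
                 || $a->[2] <=> $b->[2] }   # second level: high end
             map  { [ $_, /\d+/g ] }        # [ original, low, high ]
             @ranges;

print "@sorted\n";
```

Each string is parsed once in the bottom map, so the regex cost grows with the number of elements rather than with the number of comparisons.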

TLP answered Jan 07 '23 19:01


You need to extract the first number from every element and do a numerical comparison using the <=> operator:

my @array = qw(1-5 7-9 10-15 20-58 123-192 234-256);
my @sorted = sort {
  my ($aa,$bb) = map /^([0-9]+)/, $a,$b; 

  $aa <=> $bb;
} @array;
mpapec answered Jan 07 '23 19:01