I'm trying to sort lists of names in Perl using a specific letter order, to support some special features.
The sort should work the same way as sort { $a cmp $b }, but with a different ordering of the letters.
For example, ordering by the arbitrary character order "abdrtwsuiopqe987654" ...
I tried something like sort { $a myFunction $b }, but I'm new to Perl and I don't see how to write myFunction to get what I want.
Also, where can I find how the cmp function is implemented in Perl, to see how it works?
Sorting in Perl is done with the built-in sort function. Perl 5.6 and earlier implemented it with quicksort; since Perl 5.8 it uses a merge sort. Arrays that contain mixed (alphanumeric) strings can be sorted in various ways by passing sort a custom comparison.
A comparison returns a negative number, zero, or a positive number depending on whether its first operand sorts before, the same as, or after its second. Perl has two operators that behave this way: <=> for sorting numbers in ascending numeric order, and cmp for sorting strings in ascending code-point order. By default, sort uses cmp-style comparisons.
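As a minimal sketch of the difference between the two operators (the sample data below is made up for illustration):

```perl
use strict;
use warnings;

my @numbers = (10, 9, 100, 1);
my @words   = qw(pear Apple banana apple);

my @by_number = sort { $a <=> $b } @numbers;   # numeric comparison
my @by_string = sort { $a cmp $b } @words;     # string comparison (uppercase sorts first)
my @default   = sort @numbers;                 # no block: compared as strings with cmp

print "@by_number\n";   # 1 9 10 100
print "@by_string\n";   # Apple apple banana pear
print "@default\n";     # 1 10 100 9
```

Note that the default sort treats the numbers as strings, which is why 9 ends up last.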
The following is probably the fastest[1]:
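# Translate each name into the standard alphabet, then compare with cmp.
# tr///r returns the translated copy and leaves the original untouched.
sub my_compare($$) {
    $_[0] =~ tr{abdrtwsuiopqe987654}{abcdefghijklmnopqrs}r
        cmp
    $_[1] =~ tr{abdrtwsuiopqe987654}{abcdefghijklmnopqrs}r
}

my @sorted = sort my_compare @unsorted;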
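To see what this does in practice, here is a small self-contained run. The word list is invented for the example and only uses characters from the custom alphabet; characters outside it are left untranslated by tr and would sort by their normal code point.

```perl
use strict;
use warnings;
use 5.014;   # tr///r needs Perl 5.14 or later

sub my_compare($$) {
    $_[0] =~ tr{abdrtwsuiopqe987654}{abcdefghijklmnopqrs}r
        cmp
    $_[1] =~ tr{abdrtwsuiopqe987654}{abcdefghijklmnopqrs}r
}

# Hypothetical sample data, not from the original question.
my @unsorted = qw(ward east bard star tab dart);
my @sorted   = sort my_compare @unsorted;

print "@sorted\n";   # bard dart tab ward star east
```

A plain sort { $a cmp $b } on the same list would give bard dart east star tab ward, so the custom order moves "east" to the end ("e" is late in the custom alphabet) and puts "ward" before "star" ("w" comes before "s").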
Or if you want something more dynamic, the following might be the fastest[2]:
my @syms = split //, 'abdrtwsuiopqe987654';
my @map; $map[ord($syms[$_])] = $_ for 0..$#syms;   # code point -> rank

sub my_compare($$) {
    # unpack 'C*' already yields code points, so index @map with them directly.
    (pack 'C*', map $map[$_], unpack 'C*', $_[0])
        cmp
    (pack 'C*', map $map[$_], unpack 'C*', $_[1])
}

my @sorted = sort my_compare @unsorted;
We could compare character by character, but that will be far slower.
use List::Util qw( min );

my @syms = split //, 'abdrtwsuiopqe987654';
my @map; $map[ord($syms[$_])] = $_ for 0..$#syms;

sub my_compare($$) {
    my $l0 = length($_[0]);
    my $l1 = length($_[1]);
    # Compare only the positions both strings have (indexes 0 .. min-1).
    for (0 .. min($l0, $l1) - 1) {
        my $ch0 = $map[ord(substr($_[0], $_, 1))];
        my $ch1 = $map[ord(substr($_[1], $_, 1))];
        return -1 if $ch0 < $ch1;
        return +1 if $ch0 > $ch1;
    }

    # Equal up to the shorter length: the shorter string sorts first.
    return -1 if $l0 < $l1;
    return +1 if $l0 > $l1;
    return 0;
}

my @sorted = sort my_compare @unsorted;
[1] Technically, it can be made faster using a GRT (Guttman-Rosler Transform):
my @sorted =
    map /\0(.*)/s,
    sort
    map { tr{abdrtwsuiopqe987654}{abcdefghijklmnopqrs}r . "\0" . $_ }
    @unsorted;
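[2] Technically, it can be made faster using GRT:
# unpack 'C*' already yields code points, so @map is indexed with them directly.
# The +1 keeps every key byte non-zero, so it cannot be mistaken for the "\0" separator.
my @sorted =
    map /\0(.*)/s,
    sort
    map { ( pack 'C*', map $map[$_] + 1, unpack 'C*', $_ ) . "\0" . $_ }
    @unsorted;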
cmp is implemented by the scmp operator.
$ perl -MO=Concise,-exec -e'$x cmp $y'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <#> gvsv[*x] s
4 <#> gvsv[*y] s
5 <2> scmp[t3] vK/2
6 <@> leave[1 ref] vKP/REFC
The scmp operator is implemented by the pp_scmp function in pp.c, which is really just a wrapper for sv_cmp_flags in sv.c when use locale; isn't in effect. sv_cmp_flags uses either the C library function memcmp or a UTF-8 aware version (depending on the type of scalar).
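The practical consequence is that, without use locale, cmp orders strings by their character codes rather than by dictionary rules. A minimal sketch with made-up strings:

```perl
use strict;
use warnings;

# Without "use locale", cmp compares character codes, not dictionary order.
my @results = (
    "B"  cmp "a",     # -1: "B" (66) comes before "a" (97)
    "ab" cmp "abc",   # -1: a prefix sorts before the longer string
    "b"  cmp "b",     #  0: identical strings
);

print "@results\n";   # -1 -1 0
```

This is why remapping the characters first (with tr or a lookup table) is enough to get a custom order out of the built-in comparison.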
use Sort::Key qw(keysort);
my @sorted = keysort { tr/abdrtwsuiopqe987654/abcdefghijklmnopqrs/r } @data;
Or, in older perls that do not support the r flag in tr/.../.../r:
my @sorted = keysort { my $key = $_;
                       $key =~ tr/abdrtwsuiopqe987654/abcdefghijklmnopqrs/;
                       $key } @data;
You can also create a specialized sort subroutine for that kind of data as follows:
use Sort::Key::Maker 'my_special_sort',
sub { tr/abdrtwsuiopqe987654/abcdefghijklmnopqrs/r },
qw(string);
my @sorted = my_special_sort @data;
my @sorted2 = my_special_sort @data2;