I'm trying to sort lists of names in Perl using a specific letter order, to implement some special features.
The sort should work the same way as sort { $a cmp $b }, but with a different ordering of the letters.
For example, sorting with the arbitrary character order "abdrtwsuiopqe987654" ...
I tried something like sort { $a myFunction $b }, but I'm new to Perl and I don't see how to write myFunction correctly to get what I want.
Also, is there a way to find how the cmp function is implemented in Perl, to see how it works?
(In Java, the analogous task of sorting the characters of a string uses the toCharArray() method: get the required string, convert it to a character array using toCharArray(), sort the resulting array using the sort() method of the Arrays class, and convert the sorted array back to a String by passing it to the String constructor.)
Sorting in Perl is done with the built-in sort function. Since Perl 5.8 this function uses a merge sort algorithm to sort the list passed to it (earlier versions used quicksort). Lists of mixed, i.e. alphanumeric, strings can be sorted in various ways by giving sort a custom comparison.
Perl has two comparison operators that return -1, 0, or 1 and can therefore drive sort: <=> for sorting numbers in ascending numeric order, and cmp for sorting strings in ascending alphabetic order. By default, sort uses cmp-style comparisons.
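As a minimal sketch of the difference between the two operators (the sample lists are made up for illustration), the same numbers sort differently under <=> and cmp:

```perl
use strict;
use warnings;

# Hypothetical sample data.
my @numbers = (10, 9, 100, 1);
my @words   = qw(pear Apple apple banana);

# <=> compares numerically.
my @by_number = sort { $a <=> $b } @numbers;

# cmp compares as strings, so "10" and "100" sort before "9".
my @by_string = sort { $a cmp $b } @numbers;

# With no comparison block, sort behaves like cmp;
# uppercase letters sort before lowercase ones in ASCII.
my @default = sort @words;

print "@by_number\n";   # 1 9 10 100
print "@by_string\n";   # 1 10 100 9
print "@default\n";     # Apple apple banana pear
```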
The following is probably the fastest[1]:
sub my_compare($$) {
    $_[0] =~ tr{abdrtwsuiopqe987654}{abcdefghijklmnopqrs}r
        cmp
    $_[1] =~ tr{abdrtwsuiopqe987654}{abcdefghijklmnopqrs}r
}
my @sorted = sort my_compare @unsorted;
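To see the effect on concrete data, here is a self-contained sketch of the same tr-based comparator. The word list is hypothetical and deliberately uses only characters from the custom alphabet; characters outside it would be left untranslated and could collide with translated ones.

```perl
use strict;
use warnings;
use 5.014;   # tr///r needs Perl 5.14 or newer

# Translate each character of the custom alphabet to a letter whose
# ordinary cmp position matches its rank in the custom order.
sub my_compare($$) {
    $_[0] =~ tr{abdrtwsuiopqe987654}{abcdefghijklmnopqrs}r
        cmp
    $_[1] =~ tr{abdrtwsuiopqe987654}{abcdefghijklmnopqrs}r
}

# Hypothetical sample data.
my @unsorted = qw(water sweet bat drip art);
my @sorted   = sort my_compare @unsorted;

print "@sorted\n";   # art bat drip water sweet
```

With plain cmp, "sweet" would come before "water"; here "water" comes first because w precedes s in the custom order.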
Or if you want something more dynamic, the following might be the fastest[2]:
my @syms = split //, 'abdrtwsuiopqe987654';
my @map; $map[ord($syms[$_])] = $_ for 0..$#syms;
sub my_compare($$) {
    (pack 'C*', map $map[ord($_)], unpack 'C*', $_[0])
        cmp
    (pack 'C*', map $map[ord($_)], unpack 'C*', $_[1])
}
my @sorted = sort my_compare @unsorted;
We could compare character by character, but that would be far slower:
use List::Util qw( min );
my @syms = split //, 'abdrtwsuiopqe987654';
my @map; $map[ord($syms[$_])] = $_ for 0..$#syms;
sub my_compare($$) {
    my $l0 = length($_[0]);
    my $l1 = length($_[1]);
    # Compare only the positions that exist in both strings.
    for (0..min($l0, $l1)-1) {
        my $ch0 = $map[ord(substr($_[0], $_, 1))];
        my $ch1 = $map[ord(substr($_[1], $_, 1))];
        return -1 if $ch0 < $ch1;
        return +1 if $ch0 > $ch1;
    }
    # All shared positions are equal, so the shorter string sorts first.
    return -1 if $l0 < $l1;
    return +1 if $l0 > $l1;
    return 0;
}
my @sorted = sort my_compare @unsorted;
[1] Technically, it can be made faster using GRT (the Guttman-Rosler Transform):
my @sorted =
    map /\0(.*)/s,
    sort
    map { tr{abdrtwsuiopqe987654}{abcdefghijklmnopqrs}r . "\0" . $_ }
    @unsorted;
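[2] Technically, it can be made faster using GRT:
my @sorted =
    map /\0(.*)/s,
    sort
    # Ranks are offset by 1 so that no key byte equals the "\0" separator
    # (the first symbol would otherwise be packed as chr(0)).
    map { ( pack 'C*', map 1 + $map[ord($_)], unpack 'C*', $_ ) . "\0" . $_ }
    @unsorted;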
cmp is implemented by the scmp operator.
$ perl -MO=Concise,-exec -e'$x cmp $y'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <#> gvsv[*x] s
4 <#> gvsv[*y] s
5 <2> scmp[t3] vK/2
6 <@> leave[1 ref] vKP/REFC
The scmp operator is implemented by the pp_scmp function in pp.c, which is really just a wrapper for sv_cmp_flags in sv.c when use locale; isn't in effect. sv_cmp_flags uses either the C library function memcmp or a UTF-8-aware version (depending on the type of scalar).
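In practice, this means that without use locale; cmp orders strings by their character codes, position by position. A small sketch with made-up strings, not taken from the Perl source:

```perl
use strict;
use warnings;

# "Z" has code 90 and "a" has code 97, so "Z" sorts first.
my $upper_first = ("Z" cmp "a");

# A string that is a prefix of another sorts first.
my $prefix_first = ("ab" cmp "abc");

# Identical strings compare as equal.
my $equal = ("abc" cmp "abc");

print "$upper_first $prefix_first $equal\n";   # -1 -1 0
```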
use Sort::Key qw(keysort);
my @sorted = keysort { tr/abdrtwsuiopqe987654/abcdefghijklmnopqrs/r } @data;
Or, in older perls that do not support the r flag in tr/.../.../r:
my @sorted = keysort {
    my $key = $_;
    $key =~ tr/abdrtwsuiopqe987654/abcdefghijklmnopqrs/;
    $key
} @data;
You can also create a specialized sort subroutine for that kind of data, as follows:
use Sort::Key::Maker 'my_special_sort',
    sub { tr/abdrtwsuiopqe987654/abcdefghijklmnopqrs/r },
    qw(string);
my @sorted = my_special_sort @data;
my @sorted2 = my_special_sort @data2;