How do I nicely/idiomatically split a string at a list of positions?
What I have:
.say for split-at( "0019ABX26002", (3, 4, 8) );
sub split-at( $s, @positions )
{
    my $done = 0;
    gather
    {
        for @positions -> $p
        {
            take $s.substr($done, $p - $done );
            $done = $p;
        }
        take $s.substr( $done, * );
    }
}
This works and is reasonable, but I am puzzled by the lack of language support for it. If "split on" is a thing, why isn't "split at" too? I think this should be a core operation. I should be able to write:
.say for "0019ABX26002".split( :at(3, 4, 8) );
Or maybe I am overlooking something?
Edit: A little benchmark of what we have so far:
O------------O---------O------------O--------O-------O-------O
|            | Rate    | array-push | holli  | raiph | simon |
O============O=========O============O========O=======O=======O
| array-push | 15907/s | --         | -59%   | -100% | -91%  |
| holli      | 9858/s  | 142%       | --     | -100% | -79%  |
| raiph      | 72.8/s  | 50185%     | 20720% | --    | 4335% |
| simon      | 2901/s  | 1034%      | 369%   | -98%  | --    |
O------------O---------O------------O--------O-------O-------O
Code:
use Bench;
my $s = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbccccddddddddddddddddddddddddddddddddddddefggggggggggggggggggg";
my @p = 29, 65, 69, 105, 106, 107;
Bench.new.cmpthese(1000, {
    holli => sub { my @ = holli($s, @p); },
    simon => sub { my @ = simon($s, @p); },
    raiph => sub { my @ = raiph($s, @p); },
    array-push => sub { my @ = array-push($s, @p); },
});
#say user($s, @p);
sub simon($str, *@idxs ) {
    my @rotors = @idxs.map( { state $l = 0; my $o = $_ - $l; $l = $_; $o } );
    $str.comb("").rotor( |@rotors,* ).map(*.join(""));
}
sub raiph($s, @p) {
    $s.split( / <?{$/.pos == any(@p)}> / )
}
sub holli( $s, @positions )
{
    my $done = 0;
    gather
    {
        for @positions -> $p
        {
            take $s.substr($done, $p - $done );
            $done = $p;
        }
        take $s.substr( $done, * );
    }
}
sub array-push( $s, @positions )
{
    my $done = 0;
    my @result;
    for @positions -> $p
    {
        @result.push: $s.substr($done, $p - $done );
        $done = $p;
    }
    @result.push: $s.substr( $done, * );
    @result;
}
To split a string at a specific index in JavaScript, use the slice method to get the two parts of the string: str.slice(0, index) returns the part of the string up to, but not including, the given index, and str.slice(index) returns the remainder of the string.
Python's string split() method takes two parameters. separator: the delimiter at which the string is split; if it is not provided, any whitespace is a separator. maxsplit: the maximum number of splits to perform.
Personally, I'd split it into a list, use rotor to divide the list up, and join the result:
"0019ABX26002".comb().rotor(3,1,4,*).map(*.join)
If you want a split-at function (using the given indexes):
sub split-at( $str, *@idxs ) {
    my @rotors = @idxs.map( { state $l = 0; my $o = $_ - $l; $l = $_; $o } );
    $str.comb("").rotor( |@rotors,* ).map(*.join(""));
}
Basically if I want to do list type stuff I use a list.
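For anyone wondering how the positions (3, 4, 8) turn into the rotor lengths 3, 1, 4: each length is just the gap between two neighbouring positions, starting from 0. Here is a minimal sketch of that step on its own, without the state variable. The name lengths-from-positions is made up for illustration and is not something Raku provides.

```raku
# Hypothetical helper (not a core routine): turn absolute split
# positions into the chunk lengths that rotor expects.
sub lengths-from-positions(*@positions) {
    # Build overlapping pairs such as (0,3), (3,4), (4,8);
    # each chunk length is the pair's end minus its start.
    (0, |@positions).rotor(2 => -1).map({ .[1] - .[0] }).List
}

say lengths-from-positions(3, 4, 8);    # (3 1 4)

# Feed the lengths to rotor; the trailing * collects the rest of the string.
.say for "0019ABX26002".comb.rotor(|lengths-from-positions(3, 4, 8), *).map(*.join);
```

This assumes the positions are given in ascending order, as in the question.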
I came up with another version that I really like from a functional programming perspective:
sub split-at( $str, *@idxs ) {
    (|@idxs, $str.codes)
    ==> map( { state $s = 0; my $e = $_ - $s; my $o = [$s, $e]; $s = $_; $o } )
    ==> map( { $str.substr(|$_) } );
}
It works out to be slightly slower than the other one.
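The same start/length idea can probably also be written by zipping the list of start offsets against the list of end offsets. This is only a sketch: split-at-pairs is a made-up name, it assumes ascending positions, and I have not benchmarked it against the versions above.

```raku
# Hypothetical helper (not a core routine): split $s at the given
# positions by pairing each start offset with the following end offset.
sub split-at-pairs(Str $s, *@positions) {
    my @starts = 0, |@positions;          # every piece starts at 0 or at a position
    my @ends   = |@positions, $s.chars;   # and ends at the next position or the string end
    # Z zips the two lists into (start, end) pairs; substr takes start and length.
    (@starts Z @ends).map(-> ($from, $to) { $s.substr($from, $to - $from) }).List
}

.say for split-at-pairs("0019ABX26002", 3, 4, 8);
```

For the example string this should give the pieces 001, 9, ABX2 and 6002, the same as the other versions.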
One way:
.say for "0019ABX26002" .split: / <?{ $/.pos ∈ (3,4,8) }> /
displays:
001
9
ABX2
6002