I'm attempting to split a string into its component characters. For this purpose I've always used split(//, $str) as suggested by the documentation:
However, this:
print join(':', split(//, 'abc')), "\n";
uses empty string matches as separators to produce the output a:b:c; thus, the empty string may be used to split EXPR into a list of its component characters.
In my script I need an array of the first N characters or the first length($str) - 1 characters, whichever is less. To achieve this, I use split(//, $str, $n + 1) and discard the last element.
In theory this should work. If LIMIT is less than the string length, all the extra characters are grouped into the last element, which is discarded. If LIMIT is greater than or equal to the string length, the last element is the last character, which is discarded.
This is where I run into a bit of a problem.
The documentation says:
...and each of these:
print join(':', split(//, 'abc', 3)), "\n";
print join(':', split(//, 'abc', 4)), "\n";
produces the output a:b:c.
But that's not the result I'm getting. If LIMIT is greater than the number of characters, the resulting array always ends with exactly one blank element (demo):
print join(':', split(//, 'abc', 1)), "\n"; # abc
print join(':', split(//, 'abc', 2)), "\n"; # a:bc
print join(':', split(//, 'abc', 3)), "\n"; # a:b:c
print join(':', split(//, 'abc', 4)), "\n"; # a:b:c:
print join(':', split(//, 'abc', 99)), "\n"; # a:b:c:
These results directly contradict the example from the documentation.
Is the documentation wrong? Is my version of Perl (v5.22.2) wrong?
If this behavior can't be avoided, how can I accomplish my original goal?
It appears that the example in the documentation is incorrect. A little further down, the documentation says the following:
An empty trailing field, on the other hand, is produced when there is a match at the end of EXPR, regardless of the length of the match (of course, unless a non-zero LIMIT is given explicitly, such fields are removed, as in the last example).
Because I'm providing a non-zero LIMIT, trailing empty fields are retained. The empty pattern // matches after the final character but before the end of string, so exactly one trailing empty field is produced.
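A quick way to see this is to count the fields with and without a LIMIT. This is only a sketch; the variable names are illustrative, and the behavior it checks is the one reported in the question for Perl v5.22.2.

```perl
use strict;
use warnings;

# Without LIMIT, trailing empty fields are stripped.
my @no_limit = split(//, 'abc');

# With a LIMIT larger than the string length, the empty field produced by
# the match at the end of the string is kept.
my @with_limit = split(//, 'abc', 4);

print scalar(@no_limit), "\n";      # number of fields without LIMIT
print scalar(@with_limit), "\n";    # number of fields with LIMIT 4
print "last field is empty\n" if @with_limit && $with_limit[-1] eq '';
```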
The workarounds proposed in the comments, using a split pattern of (?!$) or using substr($str, 0, $n) as an input, both work.
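For reference, here is roughly what those two workarounds look like side by side. Treat it as a sketch: $str and $n are placeholder values chosen so that LIMIT exceeds the string length, which is the case that produced the stray empty element.

```perl
use strict;
use warnings;

my $str = 'abc';
my $n   = 5;    # hypothetical N, larger than the string length

# Workaround 1: forbid a match at the end of the string, so no trailing
# empty field is produced even when a LIMIT is given.
my @lookahead = split(/(?!$)/, $str, $n + 1);

# Workaround 2: truncate the input first and split without a LIMIT, so
# trailing empty fields are stripped as usual.
my @truncated = split(//, substr($str, 0, $n));

print join(':', @lookahead), "\n";
print join(':', @truncated), "\n";
```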
However, instead of forcing split to cooperate, I've opted to update the "discard the final element" logic from pop(@arr) to while (@arr && pop(@arr) eq "") { }.
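Put together, the approach might look like the sketch below. The sub name leading_chars is made up for illustration; the split call and the pop loop are the ones described above. The loop pops any trailing empty element and then stops after popping one non-empty element, which is either the overflow group or the last character.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first $n characters or the first
# length($str) - 1 characters, whichever is fewer.
sub leading_chars {
    my ($str, $n) = @_;
    my @arr = split(//, $str, $n + 1);
    # Discard trailing empty fields, then exactly one non-empty element.
    while (@arr && pop(@arr) eq "") { }
    return @arr;
}

print join(':', leading_chars('abcdef', 2)), "\n";
print join(':', leading_chars('abc', 3)), "\n";
print join(':', leading_chars('abc', 99)), "\n";
```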