I noticed some curious behavior with Perl's split function, particularly in cases where I would expect the resulting array to contain empty strings '', but it doesn't.
For example, if there are one or more delimiters at the end (or the beginning) of the string, the resulting array does not have empty strings '' as its last (or first) elements.
Example:
@s = split(/x/, 'axb')
produces a 2 element array ['a','b']
@s = split(/x/, 'axbx')
produces same array
@s = split(/x/, 'axbxxxx')
produces same array
But as soon as I put something at the end, all those empty strings do appear as elements:
@s = split(/x/, 'axbxxxxc')
produces a 6 element array ['a','b','','','','c']
Behavior is similar if the delimiters are at the beginning.
I would expect empty text between, before, or after delimiters to always produce elements in the split. Can anyone explain to me why the split behaves like this in Perl? I just tried the same thing in Python and it worked as expected.
Note: Perl v5.8
A string is split on the delimiter specified by the pattern. By default, whitespace is assumed as the delimiter. The split syntax is: split /pattern/, variableName.
split() is a string function in Perl that cuts a string into smaller pieces. A string can be split on different criteria, such as a single character, a regular expression (pattern), a group of characters, or an undefined value.
From the documentation:
By default, empty leading fields are preserved, and empty trailing ones are deleted. (If all fields are empty, they are considered to be trailing.)
That explains the behavior you're seeing with trailing fields. This generally makes sense, since people are often very careless about trailing whitespace, for example. However, you can get the trailing blank fields if you want:
split /PATTERN/,EXPR,LIMIT
If LIMIT is negative, it is treated as if an arbitrarily large LIMIT had been specified.
So to get all trailing empty fields:
@s = split(/x/, 'axbxxxx', -1);
produces a 6 element array ['a','b','','','','']
(I'm assuming you made a careless mistake when looking at leading empty fields - they definitely are preserved. Try split(/x/, 'xaxbxxxx'). The result has size 3.)
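To see the three cases side by side, here is a minimal sketch that can be run as a script. The show() helper is hypothetical, added only so that empty strings are visible in the output; the split calls are the ones discussed above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Perl): render a list with each
# field quoted, so that empty strings remain visible when printed.
sub show {
    my @fields = @_;
    return '[' . join(',', map { "'$_'" } @fields) . ']';
}

# Default behavior: trailing empty fields are deleted.
my @default = split(/x/, 'axbxxxx');

# Negative LIMIT: trailing empty fields are kept.
my @keep_all = split(/x/, 'axbxxxx', -1);

# Leading empty field is preserved; trailing ones are still deleted.
my @leading = split(/x/, 'xaxbxxxx');

print show(@default),  "\n";   # ['a','b']
print show(@keep_all), "\n";   # ['a','b','','','','']
print show(@leading),  "\n";   # ['','a','b']
```

If you want both the leading and the trailing empty fields, combine the two: a leading delimiter plus a LIMIT of -1.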