
Perl split with empty text before/after delimiters

Tags:

perl

I noticed some curious behavior with Perl's split function, particularly in cases where I would expect the resulting array to contain empty strings '', but it doesn't.

For example, if there are delimiters at the end (or the beginning) of the string, the resulting array does not have empty strings '' as its last (or first) elements.

Example:

@s = split(/x/, 'axb') 

produces a 2-element array ['a','b']

@s = split(/x/, 'axbx') 

produces the same array

@s = split(/x/, 'axbxxxx') 

produces the same array

But as soon as I put something at the end, all those empty strings do appear as elements:

@s = split(/x/, 'axbxxxxc') 

produces a 6-element array ['a','b','','','','c']

Behavior is similar if the delimiters are at the beginning.

I would expect empty text between, before, or after delimiters to always produce elements in the split. Can anyone explain to me why the split behaves like this in Perl? I just tried the same thing in Python and it worked as expected.

Note: Perl v5.8

Roman asked Sep 14 '10 18:09



1 Answer

From the documentation:

By default, empty leading fields are preserved, and empty trailing ones are deleted. (If all fields are empty, they are considered to be trailing.)

That explains the behavior you're seeing with trailing fields. This generally makes sense, since people are often very careless about trailing whitespace, for example. However, you can get the trailing blank fields if you want:

split /PATTERN/,EXPR,LIMIT

If LIMIT is negative, it is treated as if an arbitrarily large LIMIT had been specified.

So to get all trailing empty fields:

@s = split(/x/, 'axbxxxxc', -1); 
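As a quick sketch of the difference, here is the default call next to the negative-LIMIT call. The variable names are just for illustration, and the input is the question's 'axbxxxx' string, since it actually ends in delimiters:

```perl
use strict;
use warnings;

my $str = 'axbxxxx';

# Default: empty trailing fields are deleted.
my @default = split(/x/, $str);

# Negative LIMIT: empty trailing fields are kept.
my @all = split(/x/, $str, -1);

print scalar(@default), "\n";                  # 2
print scalar(@all), "\n";                      # 6
print join(',', map { "'$_'" } @all), "\n";    # 'a','b','','','',''
```

The five x delimiters give six fields in total; without the -1 the four empty ones at the end are dropped.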

(I'm assuming you made a careless mistake when looking at leading empty fields - they definitely are preserved. Try split(/x/, 'xaxbxxxx'). The result has size 3.)
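If you want to check the leading-field case yourself, something like the following should do. It is only a sketch; the array names are made up, and the 'xxx' input is an extra case added to show the "all fields are empty" rule from the documentation:

```perl
use strict;
use warnings;

# The empty leading field is preserved; the empty trailing ones are deleted.
my @lead = split(/x/, 'xaxbxxxx');

# Every field is empty, so all of them count as trailing and are deleted.
my @none = split(/x/, 'xxx');

# With a negative LIMIT every field survives, leading and trailing.
my @keep = split(/x/, 'xxx', -1);

print scalar(@lead), "\n";    # 3
print scalar(@none), "\n";    # 0
print scalar(@keep), "\n";    # 4
```

The first array comes out as ('', 'a', 'b'), which is where the size of 3 comes from.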

Cascabel answered Sep 29 '22 19:09