
perl split strange behavior

Tags: regex, split, perl

I apologize in advance: this is probably a very stupid question with an obvious solution that escapes a Perl beginner like me. It may also already have been answered on Stack Overflow, but I don't know what exactly to search for, so I have not been able to find the answer.

I have a string like:

$s = FOO: <single blank space> BAR <some whitespace character> some more text with     whitespace that can span over multiple lines, i.e. has \n in it ;

# Please excuse the lack of quotes and the long descriptions of characters in angle brackets in this example. In my code the string is correctly defined, and in place of <single blank space> I have the actual ASCII 32 character, etc.

Now I want to split $s this way:

($instType, $inst, $trailing) = split(/\s*/, $s, 3);
#please note that i do not use the my keyword as it is not in a subroutine
#but i tested with my, it does not change the behavior

I would expect $instType to take the value FOO: without any surrounding space (the actual test string contains the colon, and as far as I know it should remain in $instType). Similarly, I would expect $inst to take the value BAR without any surrounding spaces, and finally $trailing to take the rest of the string.

However, this is what I get: $instType takes F, just the single character; $inst takes O, the single character in the 2nd position of the string; and $trailing takes O: BAR and the rest.

How do I address the issue?

PS: Perl is 5.18.0

Sean asked Dec 22 '13 19:12


2 Answers

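The problem is the quantifier *, which means zero or more whitespace characters and therefore also matches the empty string. Use + instead, which means one or more.

Note that there are exactly zero whitespace characters between F and O, so /\s*/ matches there and split cuts the string at that point.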
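The difference between the two quantifiers can be sketched as follows. The test string is made up for illustration and only stands in for the asker's actual data.

```perl
use strict;
use warnings;

# Hypothetical test string: a single space, then a tab, then multi-line text.
my $s = "FOO: BAR\tsome more text with     whitespace\nspanning lines";

# /\s*/ can match the empty string, so split cuts between every character
# until the limit of 3 fields is reached.
my @broken = split /\s*/, $s, 3;

# /\s+/ needs at least one whitespace character for each separator.
my ($instType, $inst, $trailing) = split /\s+/, $s, 3;

print "with *: [$broken[0]] [$broken[1]] [$broken[2]]\n";
print "with +: [$instType] [$inst] [$trailing]\n";
```

With the limit of 3, the third field keeps its internal whitespace and newlines untouched. If the string might start with whitespace, split ' ', $s, 3 is worth considering: that special form behaves like /\s+/ but skips leading whitespace first.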

Casimir et Hippolyte answered Oct 16 '22 09:10


You wrote:

#please note that i do not use the my keyword as it is not in a subroutine
#but i tested with my, it does not change the behavior

You can, and should, use my outside of subroutines, too. Using that in conjunction with use strict prevents silly errors like this:

$some_field = 'bar';
if ( $some_feild ) { ... }

If those statements were separated, it could be awfully hard to track down that bug.
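As a rough illustration of how use strict catches the misspelling, the sketch below compiles the faulty snippet through a string eval so that the script itself keeps running and can report the error. The eval wrapper and the variable names $code, $ok and $error are only there for the demonstration; in a real script the compile error would simply stop the program.

```perl
use strict;
use warnings;

# The snippet with the typo, kept as a string so it is compiled at runtime.
my $code = q{
    use strict;
    my $some_field = 'bar';
    if ( $some_feild ) { print "never reached\n" }
    1;
};

# Under strict, the undeclared $some_feild is a compile-time error,
# so eval returns undef and puts the message in $@.
my $ok    = eval $code;
my $error = $@;

print $ok ? "compiled without complaint\n" : "strict caught it: $error";
```

Without use strict, the misspelled variable would silently be treated as a new, empty global and the if branch would just never run.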

Ovid answered Oct 16 '22 08:10