
Perl split pattern

Tags: regex, perl

According to the perldoc, the syntax for split is:

split /PATTERN/,EXPR,LIMIT

But the PATTERN can also be a single- or double-quoted string: split "PATTERN", EXPR. What difference does it make?

Edit: A difference I'm aware of is splitting on backslashes: split /\\/ vs split '\\'. The second form doesn't work.

planetp asked Jan 07 '11 21:01


2 Answers

It looks like it uses that as "an expression to specify patterns":

The pattern /PATTERN/ may be replaced with an expression to specify patterns that vary at runtime. (To do runtime compilation only once, use /$variable/o.)

edit: I tested it with this:

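my $foo = 'a:b:c,d,e';
# String pattern: compiled as the character class [:,]
print join(' ', split("[:,]", $foo)), "\n";       # a b c d e
# Regex literal: same result
print join(' ', split(/[:,]/, $foo)), "\n";       # a b c d e
# \Q...\E makes the text literal; "[:,]" never occurs, so nothing is split
print join(' ', split(/\Q[:,]\E/, $foo)), "\n";   # a:b:c,d,e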

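Except for the ' ' special case, a string pattern behaves just like a regular expression: the first two calls give the same result, and only the \Q...\E version treats the text literally.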

Jim Davis answered Sep 23 '22 15:09


PATTERN is always interpreted as a pattern, never as a literal value. It can be either a regex [1] or a string. Strings are compiled to regexes. For the most part the behavior is the same, but there can be subtle differences caused by the double interpretation: the string is first processed by Perl's quoting rules, and the resulting characters are then compiled as a regex.

The string '\\' only contains a single backslash. When interpreted as a pattern, it's as if you had written /\/, which is invalid:

C:\>perl -e "print join ':', split '\\', 'a\b\c'"
Trailing \ in regex m/\/ at -e line 1.

Oops!
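For comparison, here is a minimal sketch of three ways that do split on a backslash. The variable names are only illustrative. The key point is that a string pattern has to survive both the string quoting and the regex compilation, so it needs twice as many backslashes in the source.

```perl
use strict;
use warnings;

# Single-quoted, so \b and \c are not escapes: the string is a\b\c.
my $path = 'a\b\c';

# Regex literal: \\ matches one literal backslash.
my @by_regex = split /\\/, $path;

# String pattern: '\\\\' is the two-character string \\ ,
# which is then compiled as the regex /\\/.
my @by_string = split '\\\\', $path;

# quotemeta escapes a literal separator for use as a pattern.
my $sep = '\\';    # a single backslash
my @by_quotemeta = split(quotemeta($sep), $path);

print join(':', @by_regex), "\n";        # a:b:c
print join(':', @by_string), "\n";       # a:b:c
print join(':', @by_quotemeta), "\n";    # a:b:c
```

The quotemeta form is probably the safest choice when the separator comes from a variable, since it does not depend on counting backslashes by hand.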

Additionally, there are two special cases:

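  • The empty pattern //, which splits on the empty string, i.e. between every character.
  • A single space ' ' (the string, not the regex / /), which splits on runs of whitespace after first skipping any leading whitespace. Trailing empty fields are dropped, as with any split without a LIMIT.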
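The two special cases can be sketched as follows. The sample strings are made up for illustration, and the last call is added only as a contrast: the regex / / is not a special case and matches exactly one space character.

```perl
use strict;
use warnings;

my $text = '  foo  bar baz ';

# Empty pattern: splits between every character.
my @chars = split //, 'abc';

# Single-space string: leading whitespace is skipped and each run of
# whitespace counts as one separator.
my @words = split ' ', $text;

# Regex / /: every single space is a separator, so leading and doubled
# spaces produce empty fields (trailing empty fields are still dropped).
my @fields = split / /, $text;

print join('|', @chars), "\n";     # a|b|c
print join('|', @words), "\n";     # foo|bar|baz
print join('|', @fields), "\n";    # ||foo||bar|baz
```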

[1] Regexes can be supplied either inline as /.../ or as a pattern precompiled with qr//.
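A small sketch of the precompiled form, reusing the sample string from the first answer (the variable names are illustrative):

```perl
use strict;
use warnings;

# qr// compiles the pattern once and returns it as a value that can be
# stored in a variable and passed to split.
my $sep   = qr/[:,]/;
my @parts = split $sep, 'a:b:c,d,e';

print join(' ', @parts), "\n";    # a b c d e
```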

Michael Carman answered Sep 22 '22 15:09