Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What does this Perl regex mean: m/(.*?):(.*?)$/g?

Tags:

regex

perl

I am editing a Perl file, but I don't understand this regexp comparison. Can someone please explain it to me?

if ($lines =~ m/(.*?):(.*?)$/g) { } .. 

What happens here? $lines is a line from a text file.

like image 234
perlnewb Avatar asked Sep 22 '10 16:09

perlnewb


3 Answers

Break it up into parts:

$lines =~ m/ (.*?)      # Match any character (except newlines)
                        # zero or more times, not greedily, and
                        # stick the results in $1.
             :          # Match a colon.
             (.*?)      # Match any character (except newlines)
                        # zero or more times, not greedily, and
                        # stick the results in $2.
             $          # Match the end of the line.
           /gx;

So, this will match strings like ":" (it matches zero characters, then a colon, then zero characters before the end of the line, $1 and $2 are empty strings), or "abc:" ($1 = "abc", $2 is an empty string), or "abc:def:ghi" ($1 = "abc" and $2 = "def:ghi").

And if you pass in a line that doesn't match (it looks like this would be if the string does not contain a colon), then it won't process the code that's within the brackets. But if it does match, then the code within the brackets can use and process the special $1 and $2 variables (at least, until the next regular expression shows up, if there is one within the brackets).

like image 172
CanSpice Avatar answered Sep 22 '22 11:09

CanSpice


There is a tool to help understand regexes: YAPE::Regex::Explain.

Ignoring the g modifier, which is not needed here:

use strict;
use warnings;
use YAPE::Regex::Explain;

my $re = qr/(.*?):(.*?)$/;
print YAPE::Regex::Explain->new($re)->explain();

__END__

The regular expression:

(?-imsx:(.*?):(.*?)$)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  :                        ':'
----------------------------------------------------------------------
  (                        group and capture to \2:
----------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
----------------------------------------------------------------------
  )                        end of \2
----------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

See also perldoc perlre.

like image 21
toolic Avatar answered Sep 21 '22 11:09

toolic


It was written by someone who either knows too much about regular expressions or not enough about the $' and $` variables.

THis could have been written as

if ($lines =~ /:/) {
    ... # use $` ($PREMATCH)  instead of $1
    ... # use $' ($POSTMATCH) instead of $2
}

or

if ( ($var1,$var2) = split /:/, $lines, 2 and defined($var2) ) {
    ... # use $var1, $var2 instead of $1,$2
}
like image 27
mob Avatar answered Sep 25 '22 11:09

mob