
Why does my Perl regex complain about "Unmatched ) in regex"?

Tags: regex, perl

if($title =~ s/(\s|^|,|\/|;|\|)$replace(\s|$|,|\/|;|\|)//ig)

$title can be a set of titles ranging from President, MD, COO, CEO,...

$replace can be (shareholder), (Owner) or the like.

I keep getting this error. I have checked for improperly balanced '(', ')', no dice :(

Unmatched ) in regex; marked by <-- HERE in m/(\s|^|,|/|;|\|)Owner) <-- HERE (\s|$|,|/|;|\|)/

If you could tell me what the regex does, that would be awesome. Does it strip those symbols? Thanks guys!

ThinkCode asked Mar 16 '10 22:03


3 Answers

If the variable $replace can contain regex metacharacters, you should wrap it in \Q...\E:

\Q$replace\E

To quote Jeffrey Friedl's Mastering Regular Expressions:

Literal Text Span: The sequence \Q "Quotes" regex metacharacters (i.e., puts a backslash in front of them) until the end of the string, or until a \E sequence.
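
A minimal sketch of that fix applied to the substitution from the question. The sample values of $title, $replace, $other and $bad are made up for illustration and are not from the original post; the only change to the regex is the \Q...\E wrapping.

```perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the question.
my $title   = "CEO (Owner), MD";
my $replace = "(Owner)";

# \Q...\E backslash-escapes the regex metacharacters in $replace,
# so the parentheses are matched literally instead of forming a group.
if ($title =~ s/(\s|^|,|\/|;|\|)\Q$replace\E(\s|$|,|\/|;|\|)//ig) {
    print "$title\n";    # prints "CEO MD"
}

# An unbalanced value such as "Owner)" no longer breaks compilation.
my $other = "COO Owner) MD";
my $bad   = "Owner)";
$other =~ s/(\s|^|,|\/|;|\|)\Q$bad\E(\s|$|,|\/|;|\|)//ig;
print "$other\n";        # prints "COOMD"
```

Note that the separators on both sides are part of the match, so they are removed along with the title.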

Paul Creasey answered Sep 23 '22 13:09


As mentioned, the substitution strips one of those separator symbols (or whitespace, or the start of the string), followed by the contents of $replace, followed by another separator (or the end of the string). It is failing because $replace itself contains a mismatched parenthesis.

However, a few other general regex things: first, instead of ORing everything together, I'd keep the single-character alternatives together in a character class (this is just to simplify logic and typing). Matching [\s,\/;\|] is potentially less error-prone and more finger-friendly. The anchors ^ and $ have to stay outside the class, though: inside brackets they are literal characters (or a negation, when ^ comes first), not anchors.

Second, don't use capturing parentheses, i.e. a plain set of (), unless you really mean it. They place the matched string in capture buffers and incur overhead in the regex engine. Per perldoc perlre:

WARNING: Once Perl sees that you need one of $&, $`, or $' anywhere in the program, it has to provide them for every pattern match. This may substantially slow your program. Perl uses the same mechanism to produce $1, $2, etc, so you also pay a price for each pattern that contains capturing parentheses.

You can easily get around this by adding ?: right after the opening parenthesis, which makes the group non-capturing:

(?:^|[\s,\/;\|])

Edit: not that the character class itself needs any grouping; the group only remains to hold the alternation with the anchor, and it was already in the original regex.
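
Putting the two suggestions together, one possible form of the pattern is sketched below. The names $sep, $before, $after and $stripped, and the sample title, are assumptions for illustration only; \Q...\E is added so the sketch compiles for any value of $replace.

```perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the question.
my $title   = "President;Owner;MD";
my $replace = "Owner";

# Single-character separators live in a character class; the anchors
# stay outside it because ^ and $ are not anchors inside [...].
my $sep    = qr/[\s,\/;|]/;
my $before = qr/(?:^|$sep)/;    # non-capturing: start of string or a separator
my $after  = qr/(?:$sep|$)/;    # non-capturing: a separator or end of string

# Copy $title, then strip the match from the copy.
(my $stripped = $title) =~ s/$before\Q$replace\E$after//ig;
print "$stripped\n";            # prints "PresidentMD"
```

Like the original, this consumes the separator on both sides of the match, so the neighbouring words end up joined.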

Marc Bollinger answered Sep 21 '22 13:09


It appears that your variable $replace contains the string Owner), not (Owner).


$title = "Foo Owner Bar";
$replace = "Owner)";
if($title =~ s/(\s|^|,|\/|;|\|)$replace(\s|$|,|\/|;|\|)//ig) {
    print $title;
}

Output:

Unmatched ) in regex; marked by <-- HERE in m/(\s|^|,|/|;|\|)Owner)<-- HERE (\s|$|,|/|;|\|)/ at test.pl line 3.

$title = "Foo Owner Bar";
$replace = "(Owner)";
if($title =~ s/(\s|^|,|\/|;|\|)$replace(\s|$|,|\/|;|\|)//ig) {
    print $title;
}

Output:

FooBar
Mark Byers answered Sep 23 '22 13:09