
What are the precedence rules for Perl regular expressions?

Tags: regex, perl

I can't find an official reference for the precedence rules for Perl regular expressions. All I can find is "Know the precedence of regular expression operators". However, it's not an official reference given by perldoc.

Jingguo Yao asked Mar 07 '16 06:03




1 Answer

Regular expressions only have two binary operators, one of which is implicit rather than represented by a symbol. Regular expressions also have a number of unary operators, but their precedence is moot due to the restrictions on their operands. That makes talking about precedence really odd.

It's simpler conveying the information you seek using the following statements:

  • Quantifiers modify a single atom.
  • Quantifier modifiers modify a single quantifier.
  • An alternation is bounded only by the parentheses in which it resides.

The above information is conveyed one way or another in perlretut.
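
As a quick check of those three statements, here is a minimal Perl sketch. The sample strings and variable names are made up for illustration and do not come from perlretut.

```perl
use strict;
use warnings;

# 1. A quantifier modifies only the single atom right before it.
my $single_atom = 'abbb' =~ /^ab+$/     ? 1 : 0;   # "+" applies to "b" alone
my $not_pair    = 'abab' =~ /^ab+$/     ? 1 : 0;   # so "ab" is not repeated
my $grouped     = 'abab' =~ /^(?:ab)+$/ ? 1 : 0;   # a group counts as one atom

# 2. A quantifier modifier modifies the quantifier right before it.
my ($greedy) = 'aaa' =~ /^(a+)/;    # plain "+" is greedy
my ($lazy)   = 'aaa' =~ /^(a+?)/;   # "?" makes that "+" non-greedy

# 3. An alternation extends to the enclosing parens, or to the whole
#    pattern when there are none.
my $whole_pattern = 'catfish' =~ /^cat|dog$/     ? 1 : 0;   # "^cat" OR "dog$"
my $inside_parens = 'catfish' =~ /^(?:cat|dog)$/ ? 1 : 0;   # anchors apply to both

print "single_atom=$single_atom not_pair=$not_pair grouped=$grouped\n";
print "greedy=$greedy lazy=$lazy\n";
print "whole_pattern=$whole_pattern inside_parens=$inside_parens\n";
# prints:
# single_atom=1 not_pair=0 grouped=1
# greedy=aaa lazy=a
# whole_pattern=1 inside_parens=0
```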


That said, it is possible to build a precedence table, and the statements above convey everything needed to derive it:

  1. Atoms (e.g. a, \n, \^, ., ^, \w, [...], \1, (...))
  2. Postfix unary operators (quantifiers and quantifier modifiers)
  3. Implicit "followed by" operator between (possibly-quantified) atoms
  4. Alternation

This matches the chart in the page to which you linked.
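
To see the table in action, the sketch below compares ab*|cd with the fully parenthesised form the table implies, and with two deliberately wrong groupings. The hash keys, sample strings, and the matched helper are hypothetical names chosen for this example.

```perl
use strict;
use warnings;

my @samples = ('a', 'abbb', 'cd', 'abab', 'acd', 'abd', '');

my %grouping = (
    implicit     => qr/\A(?:ab*|cd)\z/,              # as normally written
    per_table    => qr/\A(?:(?:a(?:b)*)|(?:cd))\z/,  # grouping the table implies
    wrong_concat => qr/\A(?:(?:ab)*|cd)\z/,          # if "followed by" beat "*"
    wrong_alt    => qr/\A(?:a(?:b*|c)d)\z/,          # if "|" beat "followed by"
);

# Return the samples that a pattern matches in full, comma-separated.
sub matched {
    my ($re) = @_;
    return join ',', map { length($_) ? $_ : '(empty)' }
                     grep { $_ =~ $re } @samples;
}

printf "%-12s %s\n", $_, matched($grouping{$_})
    for qw(implicit per_table wrong_concat wrong_alt);
# prints:
# implicit     a,abbb,cd
# per_table    a,abbb,cd
# wrong_concat cd,abab,(empty)
# wrong_alt    acd,abd
```

Only the per_table grouping accepts the same strings as the pattern written without extra parentheses.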


For fun, the following would be the BNF:

pattern              ::= <alternation>

alternation          ::= <sequence> <alternation2>
alternation2         ::= "|" <alternation> | ""

sequence             ::= <quantified_atom> <sequence> | ""

quantified_atom      ::= <atom> <quantified_atom2>
quantified_atom2     ::= <modified_quantifier> | ""
modified_quantifier  ::= <quantifier> <modified_quantifier2>
modified_quantifier2 ::= <quantifier_modifier> | ""
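
The grammar above leaves atom, quantifier, and quantifier_modifier undefined. As a rough illustration only, the toy recursive-descent parser below fills them in with assumptions of my own: an atom is a literal character, a backslash escape, or a parenthesised pattern; quantifiers are * + ?; quantifier modifiers are ? and +. Character classes and {n,m} are left out, and the alt/seq/quant/group output notation is invented for this sketch.

```perl
use strict;
use warnings;

my $src;    # pattern being parsed; pos($src) serves as the cursor

# pattern ::= <alternation>
sub parse_pattern {
    ($src) = @_;
    pos($src) = 0;
    my $tree = parse_alternation();
    die "unexpected input at offset " . pos($src) . "\n"
        if pos($src) < length $src;
    return $tree;
}

# alternation ::= <sequence> ( "|" <alternation> | "" )
sub parse_alternation {
    my $left = parse_sequence();
    return $left unless $src =~ /\G\|/gc;
    return "alt($left, " . parse_alternation() . ")";
}

# sequence ::= <quantified_atom> <sequence> | ""
sub parse_sequence {
    my @items;
    while (defined(my $item = parse_quantified_atom())) {
        push @items, $item;
    }
    return 'seq(' . join(', ', @items) . ')';
}

# quantified_atom ::= <atom> ( <quantifier> ( <quantifier_modifier> | "" ) | "" )
sub parse_quantified_atom {
    my $atom = parse_atom();
    return unless defined $atom;
    return $atom unless $src =~ /\G([*+?])/gc;
    my $quantifier = $1;
    my $modifier   = $src =~ /\G([?+])/gc ? $1 : '';
    return "quant($atom, $quantifier$modifier)";
}

# atom (assumed): "(" <pattern> ")", a backslash escape, or a plain character
sub parse_atom {
    if ($src =~ /\G\(/gc) {
        my $inner = parse_alternation();
        $src =~ /\G\)/gc or die "missing ) at offset " . pos($src) . "\n";
        return "group($inner)";
    }
    return $1 if $src =~ /\G(\\.|[^|()*+?\\])/gc;
    return;    # no atom here: end of input, "|" or ")"
}

print parse_pattern('ab*|cd'), "\n";
# prints: alt(seq(a, quant(b, *)), seq(c, d))
print parse_pattern('a(b|c)+?d'), "\n";
# prints: seq(a, quant(group(alt(seq(b), seq(c))), +?), d)
```

The nesting of the output mirrors the precedence table: quantifiers wrap a single atom, sequences collect quantified atoms, and alternation sits outermost.
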

ikegami answered Sep 18 '22 17:09