
RegExp: If-Clause for capturing group possible?

tl;dr:

I am looking for a way, using PCRE-style regular expressions in PHP, to match a closing character sequence that depends on which opening sequence was matched.

The task

I am writing a module to capture all translatable strings from written PHP code. One responsibility of this module will be to also capture any translation context stated within the code. This context is provided as part of an options array.

In PHP (as far as I remember, since version 5.4), there are two possible styles for defining an array:
a) array(...)
b) [...]
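For illustration, here is a minimal sketch of the two styles side by side. The 'context' key and its value are hypothetical and only stand in for the options array described above.

```php
<?php
// Style a): long syntax, available in all PHP versions.
$long = array('context' => 'greeting');

// Style b): short syntax, available since PHP 5.4.
$short = ['context' => 'greeting'];

// Both styles build the same array, so a parser has to accept either one.
var_dump($long === $short); // bool(true)
```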

I now want to write a regular expression that is able to recognize both styles. The pattern should be able to correctly match the ending character sequence depending on the style chosen to start the array.

Unfortunately, I was not able to find any documentation on how to apply the IF-statement to a given capturing group.

In theory it should look something like this:
/ ... (array\(|\[) ... (?(?=\1==\[)\]|\)) ... /
(Note: "..." in the line above should indicate that the regex pattern is longer than stated here. This should only serve as an example for what I am trying to achieve)

The (?(?=\1==\[)\]|\)) part translates to "normal language" as: if the content of the first capturing group is an opening square bracket, then the pattern should match a closing square bracket, otherwise a closing round bracket is required.

Is it possible to achieve something like this? Any help is greatly appreciated!

Thanks in advance
Chris

Chris asked Dec 28 '25 04:12


2 Answers

The regex answer is

(?:array(\()|\[).*?(?(1)\)|])


Details

  • (?:array(\()|\[) - a non-capturing group matching either array( (capturing the ( into Group 1) or a [ char
  • .*? - any zero or more chars other than line break chars, as few as possible
  • (?(1)\)|]) - a conditional construct: if Group 1 participated in the match (that is, the opening sequence was array( ), then ) must match at the current position, else ] must match.
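As a rough sketch of how this pattern could be used from PHP, the snippet below runs it through preg_match_all. The sample input string and the t() function name are hypothetical and not taken from the question.

```php
<?php
// Pattern from the answer: Group 1 is set only when "array(" opened the match.
$pattern = '/(?:array(\()|\[).*?(?(1)\)|])/';

// Hypothetical sample input containing both array styles.
$code = "t('Hi', array('context' => 'greeting')); t('Bye', ['context' => 'farewell']);";

preg_match_all($pattern, $code, $matches);

// $matches[0] holds the full matches, one per array literal found.
print_r($matches[0]);

// Mismatched delimiters are rejected by the conditional.
var_dump(preg_match($pattern, '[1, 2)'));      // int(0)
var_dump(preg_match($pattern, 'array(1, 2]')); // int(0)
```

Note that .*? stops at the first matching closing character, so an array containing a nested call or a nested array, such as array('a' => foo()), would be cut short by this pattern.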
Wiktor Stribiżew answered Dec 30 '25 23:12


If you want to capture the values using the same capturing group, you could also use a branch reset group (?|...), so that the value ends up in group 1 whichever alternative matches.

To match the values between the opening and closing parentheses or square brackets, you could use a negated character class [^...], which matches any char except those listed in the character class. Note that the captured value includes the surrounding brackets.

(?|array(\([^()]*\))|(\[[^][]*]))

Explanation

  • (?| Branch reset group
    • array match literally
    • ( Capture group 1
      • \([^()]*\) Match (...)
    • ) Close group 1
    • | Or
    • ( Capture group 1 (the branch reset restarts the group numbering in each alternative)
      • \[[^][]*] Match [...]
    • ) Close group 1
  • ) close branch reset group
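A short sketch of the branch reset pattern in PHP follows. The sample input and the t() function name are hypothetical. It shows that both array styles land in group 1 and that no group 2 exists.

```php
<?php
// Branch reset: both alternatives store their capture in Group 1.
$pattern = '/(?|array(\([^()]*\))|(\[[^][]*]))/';

// Hypothetical sample input containing both array styles.
$code = "t('Hi', array('context' => 'greeting')); t('Bye', ['context' => 'farewell']);";

preg_match_all($pattern, $code, $matches);

// $matches[1] holds Group 1 for every match, whichever style was used.
print_r($matches[1]);

// Only index 0 (full match) and index 1 (Group 1) are present.
var_dump(count($matches)); // int(2)
```

Because the negated character classes exclude the bracket characters themselves, this pattern will not match an array that contains nested parentheses or nested square brackets.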


The fourth bird answered Dec 30 '25 23:12


