 

Match Strings with Variable Left/Right Delimiters

The problem is simple to state. I would like to match anything between one of several opening strings and one of several closing strings, where the closing string must correspond to the opening string that was actually matched.

Let's assume that I want to match everything that is between [ and ] or { and }.

The first regular expression that could be used is:

/[{\[](.*)[}\]]/gmU

However, there is one problem with it. When the subject is:

{aa} werirweiu [ab] wrewre [ac}

[ac} is also matched, but it shouldn't be.

It can be easily changed into:

/\[(.*)\]|\{(.*)\}/gmU

and the problem is solved.
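As a rough illustration, the two patterns above can be compared on the sample subject. This sketch uses Python's re module, which is an assumption (the flags in the question suggest PCRE); Python has no U flag, so the ungreedy behaviour is written as .*? instead.

```python
import re

subject = "{aa} werirweiu [ab] wrewre [ac}"

# First attempt: any opener followed by any closer (lazy quantifier replaces the U flag).
naive = re.compile(r"[{\[](.*?)[}\]]")
# Second attempt: one alternative per delimiter pair, with the body repeated.
paired = re.compile(r"\[(.*?)\]|\{(.*?)\}")

naive_hits = [m.group(0) for m in naive.finditer(subject)]
paired_hits = [m.group(0) for m in paired.finditer(subject)]

print(naive_hits)   # ['{aa}', '[ab]', '[ac}']  (the mismatched pair slips through)
print(paired_hits)  # ['{aa}', '[ab]']
```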

But what if (.*) were much more complicated, and there were, say, 10 pairs of beginnings and endings, each several characters long rather than a single character? With the rule above, the whole (.*) would have to be repeated 10 times, and the expression would become illegible.

Is there any way to tie each ending to its beginning? For example, I would like to use syntax similar to

/(aa|bb)(.*)(cc|ddd)/gmU

to say that a match must either begin with aa and end with cc, or begin with bb and end with ddd. In the subject aaxx1cc bbxx2ddd aaxx3ddd bbxx4cc it should match only the strings xx1 and xx2, without repeating (.*) many times in the regular expression, and bearing in mind that there may be more than the two beginnings and endings shown above.

Marcin Nabiałek asked Jan 10 '23 03:01


1 Answer

Use a Conditional

In my view, this is a very nice place to use conditionals. This regex will work:

(?:(\[)|({)).*?(?(1)\])(?(2)})

See what matches and fails in the Regex Demo.
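For readers without access to the demo, here is a minimal sketch of the same pattern in Python, whose re module also supports (?(n)...) conditionals. The extra capturing group around .*? (Group 3) is an addition for illustration, so the delimited content can be pulled out; it is not part of the pattern above.

```python
import re

# Groups 1 and 2 record which opener matched; Group 3 (added here) holds the content.
conditional = re.compile(r"(?:(\[)|(\{))(.*?)(?(1)\])(?(2)\})")

subject = "{aa} werirweiu [ab] wrewre [ac}"
contents = [m.group(3) for m in conditional.finditer(subject)]

print(contents)  # ['aa', 'ab']  ([ac} is rejected because [ requires ])
```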

Other Kinds of Delimiters

This is easy to expand: for instance, the following pattern will match strings delimited by START and END, or by <-- and -->, or by ==: and :==.

(?:(START)|(<--)|(==:)).*?(?(1)END)(?(2)-->)(?(3):==)

See the Regex Demo.
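Since the pattern grows mechanically with each pair, it could also be generated from a list. The following is only a sketch in Python: build_delimiter_regex and extract are hypothetical helper names, and the capturing group around .*? is an addition so the content can be returned.

```python
import re

def build_delimiter_regex(pairs):
    """Compile a conditional regex from (opener, closer) pairs (hypothetical helper)."""
    # One capturing group per opener: Groups 1..n.
    openers = "|".join(f"({re.escape(opener)})" for opener, _ in pairs)
    # One conditional per closer: required only when the matching opener's group is set.
    closers = "".join(
        f"(?({i}){re.escape(closer)})"
        for i, (_, closer) in enumerate(pairs, start=1)
    )
    # The content group comes after all opener groups, so its index is n + 1.
    return re.compile(f"(?:{openers})(.*?){closers}")

def extract(pairs, text):
    """Return the content found between correctly paired delimiters."""
    regex = build_delimiter_regex(pairs)
    content_group = len(pairs) + 1
    return [m.group(content_group) for m in regex.finditer(text)]

PAIRS = [("START", "END"), ("<--", "-->"), ("==:", ":==")]
result = extract(PAIRS, "STARTaEND <--b--> ==:c:== STARTd-->")
print(result)  # ['a', 'b', 'c']  (STARTd--> is skipped: START needs END)
```

With this approach, supporting another delimiter pair only means adding an entry to the list; the body of the match is still written once.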

Explanation

  • The non-capture group (?:(\[)|({)) matches the opening delimiter, i.e. either
      • [ which (\[) captures to Group 1
      • OR |
      • { which ({) captures to Group 2
  • .*? lazily matches up to a point where...
  • (?(1)\]) if Group 1 is set, we match ]
  • (?(2)}) if Group 2 is set, we match }
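The breakdown above can also be written into the pattern itself. This sketch assumes Python's re.VERBOSE mode, which ignores whitespace and allows comments inside the pattern, and then checks which group is set for each kind of opener.

```python
import re

annotated = re.compile(r"""
    (?:            # non-capturing wrapper around the opening delimiter
        (\[)       #   Group 1 is set when the opener is a square bracket
      |            #   OR
        (\{)       #   Group 2 is set when the opener is a curly brace
    )
    .*?            # lazily consume characters up to the closer
    (?(1)\])       # if Group 1 is set, require a closing square bracket
    (?(2)\})       # if Group 2 is set, require a closing curly brace
""", re.VERBOSE)

square = annotated.fullmatch("[ab]")
curly = annotated.fullmatch("{aa}")
mixed = annotated.fullmatch("[ac}")

print(square.group(1), square.group(2))  # [ None
print(curly.group(1), curly.group(2))    # None {
print(mixed)                             # None
```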

Reference

  • Conditional Regex 101
  • If-Then-Else Conditionals in Regular Expressions
zx81 answered Jan 19 '23 00:01