
Regex to match only innermost delimited sequence

I have a string that contains sequences marked by multi-character delimiters: << and >>. I need a regular expression that gives me only the innermost sequences. I have tried lookaheads, but they don't work the way I expect them to.

Here is a test string:

'do not match this <<but match this>> not this <<BUT NOT THIS <<this too>> IT HAS CHILDREN>> <<and <also> this>>'

It should return:

but match this
this too
and <also> this

As you can see from the third result, I can't just use /<<[^>]+>>/, because a sequence may contain a single delimiter character (< or >) on its own, just not two in a row.
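
To make that failure concrete, here is a quick check of the naive pattern. It is only a sketch, written in Python for illustration, with a capture group added so that just the body is returned; the variable names `text` and `naive` are made up for this example.

```python
import re

text = ("do not match this <<but match this>> not this "
        "<<BUT NOT THIS <<this too>> IT HAS CHILDREN>> <<and <also> this>>")

# Naive attempt: the body may be anything except a '>' character.
naive = re.findall(r"<<([^>]+)>>", text)
print(naive)
# ['but match this', 'BUT NOT THIS <<this too']
```

It goes wrong in two ways: the second hit starts at the outer << and swallows the nested opener, and the third sequence is missed entirely because of the lone > inside it.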

I'm fresh out of trial-and-error. Seems to me this shouldn't be this complicated.

amphetamachine asked Dec 06 '22 20:12


1 Answer

# Capture only the text between the delimiters; the lookahead stops at any nested << or >>.
my @matches = $string =~ /<<((?:(?!<<|>>).)*)>>/g;

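(?:(?!PAT).)* is to patterns as [^CHAR]* is to characters: it matches any run of characters in which PAT never begins. Because the text between << and >> can then contain neither delimiter, only innermost sequences can match.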
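
The idiom carries over to other regex engines that support lookahead. Below is a minimal sketch of it in Python, not part of the original answer: the helper name `innermost` and its `opener`/`closer` parameters are hypothetical, and the capture group is placed around the body so the delimiters are left out of the results.

```python
import re

def innermost(text, opener="<<", closer=">>"):
    """Return the bodies of delimited sequences that contain no nested delimiter.

    Hypothetical helper for illustration only.
    """
    o, c = re.escape(opener), re.escape(closer)
    # (?:(?!o|c).)* consumes one character at a time, but only while
    # neither delimiter begins at the current position.
    pattern = re.compile(f"{o}((?:(?!{o}|{c}).)*){c}")
    return pattern.findall(text)

text = ("do not match this <<but match this>> not this "
        "<<BUT NOT THIS <<this too>> IT HAS CHILDREN>> <<and <also> this>>")

print(innermost(text))
# ['but match this', 'this too', 'and <also> this']
```

As in Perl without the /s modifier, `.` does not match a newline here, so a sequence that spans lines would need `re.DOTALL` added to the compile call.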

ikegami answered Dec 09 '22 10:12
