
Regex: how to use lookahead/lookbehind on the result of a pattern?

Tags: regex

I'm trying to learn more about regex today.

I'm simply trying to match an order number that isn't surrounded by brackets (#1234 but not [#1234]), but my question is more generally about using lookahead assertions on an arbitrary pattern.

On my first attempts I noticed that my negative lookahead, \d+(?!\]), lets \d+ backtrack: it gives up digits until the position where it stops is no longer followed by a ], so the 1234 in [#1234] still yields the partial match 123. I need the digits to match only if the entire run isn't followed by a ].
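To make the backtracking concrete, here is a minimal sketch in Python's re module (my choice of engine for illustration; the question doesn't name one). The variable names are just placeholders.

```python
import re

# First attempt: "#" plus digits, not followed by "]".
naive = re.compile(r"#\d+(?!\])")

# \d+ first grabs "1234", the lookahead sees "]" and fails,
# so the engine backtracks to "123", which is followed by "4".
partial = naive.search("[#1234]").group()

# With no bracket after the digits, the whole run matches.
plain = naive.search("#1234").group()

print(partial, plain)
```

The first search returns #123 rather than failing, which is exactly the partial match I want to avoid.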

My current solution kills the match at the first digit by looking ahead to see if there's a ] in the digit chain.

Is this a standard way to go about this? I'm just repeating the match pattern in the lookahead. If this were a more complex regex, would I approach it the same way: repeat the valid match followed by the invalid match, and have the regex engine repeat itself for every character?

For valid matches, it would have to match itself as many times as the characters in the match.

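(?<!\[)     # not preceded by [
\#\d+       # literal # (escaped, because a bare # starts a comment in free-spacing mode), then one or more digits
(?!\d*\])   # not followed by zero or more digits and then ]

# or (?!\d|\])   # not followed by a digit or ]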
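As a quick check of both variants, here is a sketch assuming Python's re module with re.VERBOSE for the commented form. The sample strings are made up for illustration.

```python
import re

# Commented form; "#" is escaped because verbose mode treats a bare # as a comment.
order = re.compile(r"""
    (?<!\[)      # not preceded by [
    \#\d+        # literal # then one or more digits
    (?!\d*\])    # not followed by optional digits and then ]
""", re.VERBOSE)

# Alternative lookahead: the match may not stop in front of a digit or a ].
order_alt = re.compile(r"(?<!\[)#\d+(?!\d|\])")

samples = ["#1234", "[#1234]", "#1234]", "[#1234", "see #42 and [#7]"]
results = {s: order.findall(s) for s in samples}
results_alt = {s: order_alt.findall(s) for s in samples}

for s in samples:
    print(repr(s), results[s], results_alt[s])
```

On these inputs both variants agree: only #1234 on its own and #42 are matched, and every bracketed or half-bracketed number is rejected.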

I'd appreciate any feedback!

Yuji 'Tomita' Tomita asked Nov 18 '11 05:11



1 Answer

You can achieve what you want by using a possessive quantifier along with lookarounds, like this:

(?<!\[)#\d++(?!\])

The problem in your case is that \d+ allows backtracking, so for an input such as #1234] the engine gives back the last digit and ends up with the partial match #123. Once you change it to a possessive quantifier, \d++ will not give back any digits it has consumed, so the pattern only matches if the whole sequence of digits is not preceded or followed by brackets.
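A small sketch of the difference, assuming Python's re module. Possessive quantifiers are only available there from Python 3.11, hence the version guard. The test string is made up.

```python
import re
import sys

text = "#1234 [#5678] #90] [#11 #22"

# Backtracking version: \d+ can give up digits to satisfy the lookahead,
# so "#90]" leaks through as the partial match "#9".
greedy = re.findall(r"(?<!\[)#\d+(?!\])", text)

# Possessive version: \d++ never gives digits back, so "#90]" is rejected.
if sys.version_info >= (3, 11):
    possessive = re.findall(r"(?<!\[)#\d++(?!\])", text)
else:
    possessive = None  # this interpreter's re module lacks possessive quantifiers

print(greedy)
print(possessive)
```

On a 3.11+ interpreter the greedy list contains #9 while the possessive list does not. Engines such as PCRE and Java accept the same \d++ syntax.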


Edit: If possessive quantifiers are not supported, you can use this one:

#\d(?<!\[#\d)(?!\d*\])\d*
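This fallback only needs a fixed-width lookbehind and an ordinary lookahead, so it should run on engines without possessive quantifiers. A sketch assuming Python's re module, with a made-up test string:

```python
import re

# Match "#" and the first digit, then check both conditions once,
# before the remaining digits are consumed.
fallback = re.compile(r"#\d(?<!\[#\d)(?!\d*\])\d*")

text = "#1234 [#5678] #90] [#11 #22"
found = fallback.findall(text)

print(found)
```

Because the lookahead (?!\d*\]) is evaluated right after the first digit, backtracking in the trailing \d* cannot produce a shorter match that slips past it.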
Narendra Yadala answered Nov 08 '22 15:11
