
How to simulate non-greedy quantifiers in languages that don't support them?

Tags: regex

Consider this regex: <(.*)>

Applied against this string:

<2356> <my pal ned> <!@%@>

Obviously, it will match the entire string because of the greedy *. The best solution would be to use a non-greedy quantifier, like *?. However, many languages and editors don't support these.

For simple cases like the above, I've gotten around this limitation with a regex like this: <([^>]*)>

But what could be done with a regex like this? start (.*) end

Applied against this string:

start 2356 end start my pal ned end start !@%@ end

Is there any recourse at all?

Ipsquiggle asked Jan 15 '10 21:01


2 Answers

If the end condition is the presence of a single character you can use a negative character class instead:

<([^>]*)>
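
As a rough illustration of the difference, here is a minimal Python sketch run against the sample string from the question. Python does support lazy quantifiers, so it is used here only as a convenient place to test the patterns; the variable names are made up for the example.

```python
import re

text = "<2356> <my pal ned> <!@%@>"

# Greedy: .* runs on to the last '>' in the string, so there is one big match.
greedy = re.findall(r"<(.*)>", text)

# Negated class: [^>]* cannot cross a '>', so each bracketed group matches on its own.
negated = re.findall(r"<([^>]*)>", text)

print(greedy)   # ['2356> <my pal ned> <!@%@']
print(negated)  # ['2356', 'my pal ned', '!@%@']
```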

For more complex cases where the end condition is multiple characters, you could try a negative lookahead, but if lazy matching is not supported, chances are that lookaheads won't be either:

((?!end).)*
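
A minimal sketch of how the lookahead version could be applied to the "start ... end" string from the question, assuming an engine with lookahead support (Python's `re` is used here). The inner group is made non-capturing and wrapped in an outer capture so the whole body is returned rather than just its last character; that wrapping is an adjustment for the example, not part of the pattern above.

```python
import re

text = "start 2356 end start my pal ned end start !@%@ end"

# (?:(?!end).)* consumes one character at a time, but only at positions
# where the text "end" does not begin. No lazy quantifier is used.
pattern = re.compile(r"start ((?:(?!end).)*) end")

bodies = pattern.findall(text)
print(bodies)  # ['2356', 'my pal ned', '!@%@']
```

The star is still greedy: it first swallows the space before "end", then backtracks one character so the literal " end" can match.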

Your last recourse is to construct something horrible like this:

(en[^d]|e[^n]|[^e])*
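
For what it's worth, here is a small Python sketch of that alternation on the question's string, using only character classes, alternation and a greedy star. The names are invented for the example. It also shows one caveat: as written, the alternation is an approximation and can step over "end" when it is directly preceded by another "e".

```python
import re

text = "start 2356 end start my pal ned end start !@%@ end"

# Each iteration matches "en" + non-d, or "e" + non-n, or any non-e character.
body = r"(?:en[^d]|e[^n]|[^e])*"
pattern = re.compile(r"start (" + body + r") end")

bodies_alt = pattern.findall(text)
print(bodies_alt)  # ['2356', 'my pal ned', '!@%@']

# Caveat: "eend" contains "end", yet the alternation accepts it,
# because e[^n] consumes "ee" and then 'n' and 'd' each match [^e].
leaky = re.fullmatch(body, "eend") is not None
print(leaky)  # True
```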
Mark Byers answered Oct 08 '22 05:10


I replace . with [^>] where > in this case is the next character in the RE.

Mark Ransom answered Oct 08 '22 04:10