Unexpected non-greedy JS regular expression result

Why does

/<.+?> e/.exec("a <b> c <d> e")

(unexpectedly) return

["<b> c <d> e"]

instead of

["<d> e"]

The non-greedy operator seems to be doing nothing...

Hans asked Aug 21 '13 12:08



2 Answers

This example should help show what the lazy quantifier does (the g flag is there so that match returns every match, not just the first):

"a <b> c <d> e <f> e".match(/<.+?> e/g) // -> ["<b> c <d> e", "<f> e"]
"a <b> c <d> e <f> e".match(/<.+> e/g)  // -> ["<b> c <d> e <f> e"]

<.+?> e means: once a < is found, find the first > e

<.+> e means: once a < is found, find the last > e

In your specific case, you could simply use <[^>]+> e, which returns ["<d> e"] and is usually faster as well, since it needs less backtracking. Whenever possible, prefer the X[^X]*X form over X.*?X.
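To make the difference concrete, here is a small sketch that runs both patterns against the string from the question. The variable names (input, lazy, negated) are made up for illustration.

```javascript
// Illustrative names only; the string is the one from the question.
const input = "a <b> c <d> e";

// Lazy dot: the attempt that starts at the first "<" can still succeed,
// because "." is allowed to walk across "> c <d" until "> e" follows.
const lazy = /<.+?> e/.exec(input);

// Negated class: "[^>]+" cannot cross a ">", so the attempt starting at "<b"
// fails and the engine moves on to the next "<".
const negated = /<[^>]+> e/.exec(input);

console.log(lazy[0]);    // <b> c <d> e
console.log(negated[0]); // <d> e
```

The negated class is what rules out the unwanted match here; laziness alone does not.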

sp00m answered Sep 20 '22 12:09



"<b> c <d> e" is a perfectly valid result. Your regexp says "match <, then something, then > e", and that is exactly what you're getting. Intuitively, "<d> e" might look like a better match, but a regex engine has no intuition: it scans from left to right, finds the first position where a match is possible, and stops there.
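One way to see this "leftmost match wins" behaviour is to try every start position by hand. The sketch below uses the sticky flag (y, ES2015 and later), which only allows a match exactly at lastIndex. The helper matchAt and the found array are hypothetical names for this illustration, not part of any library.

```javascript
const input = "a <b> c <d> e";
const sticky = /<.+?> e/y; // "y": match only exactly at lastIndex

// Hypothetical helper: try to match `re` at exactly `index` in `str`.
function matchAt(re, str, index) {
  re.lastIndex = index;
  const m = re.exec(str);
  return m ? m[0] : null;
}

// Collect every start position at which the pattern can match.
const found = [];
for (let i = 0; i < input.length; i++) {
  const m = matchAt(sticky, input, i);
  if (m !== null) found.push([i, m]);
}

for (const [i, m] of found) console.log(i, m);
// 2 <b> c <d> e
// 8 <d> e
```

A match is possible at index 8, but a normal exec never gets that far: it already succeeded at index 2.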

Greediness only comes into play when there is more than one way to match from the same starting position. That is not the case here: starting at the first <, there is only one > e to end on. If your string had two > e, there would be a difference:

/<.+> e/.exec("a <b> c <d> e more > e")
> ["<b> c <d> e more > e"]
/<.+?> e/.exec("a <b> c <d> e more > e")
> ["<b> c <d> e"]
georg answered Sep 20 '22 12:09
