
What's the technical reason for "lookbehind assertion MUST be fixed length" in regex?

For example, the regex below fails to compile, with an error reporting that the lookbehind assertion is not fixed length:

#(?<!(?:(?:src)|(?:href))=["\']?)((?:https?|ftp)://[^\s\'"<>()]+)#S 

No such restriction exists for lookahead.
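The same restriction can be reproduced outside PHP. Below is a minimal sketch using Python's `re` module as a stand-in (the regex above appears to be written for PCRE, so the exact error text will differ). The names `URL`, `compiles`, `fixed_width` and `bare_urls` are hypothetical, and the workaround of chaining one fixed-width lookbehind per possible prefix is an assumption about what the pattern is meant to do, not something taken from the question.

```python
import re

# The URL part of the pattern, without the PHP delimiters and quote escaping.
URL = r"""((?:https?|ftp)://[^\s'"<>()]+)"""

# The lookbehind from the question: it can be 4 to 6 characters wide
# ("src=" up to 'href="'), so its length is not fixed.
variable_width = r"""(?<!(?:(?:src)|(?:href))=["']?)""" + URL


def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Assumed workaround: one negative lookbehind per possible prefix,
# each of which has a single, fixed width.
fixed_width = r"""(?<!src=)(?<!href=)(?<!src=["'])(?<!href=["'])""" + URL


def bare_urls(text):
    """Return URLs that are not the value of a src= or href= attribute."""
    return re.findall(fixed_width, text)


sample = 'see http://a.example/x and <a href="http://b.example/y">b</a> <img src=http://c.example/z>'
print(compiles(variable_width))  # False
print(compiles(fixed_width))     # True
print(bare_urls(sample))         # ['http://a.example/x']
```

Several negative lookbehinds in a row all have to succeed at the same position, which is why the chain behaves like a single negative lookbehind over the alternation of those prefixes.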

wamp asked Sep 26 '10 02:09



1 Answer

Lookahead and lookbehind aren't nearly as similar as their names imply. The lookahead expression works exactly the same as it would if it were a standalone regex, except it's anchored at the current match position and it doesn't consume what it matches.

Lookbehind is a whole different story. Starting at the current match position, it steps backward through the text one character at a time, attempting to match its expression at each position. In cases where no match is possible, the lookbehind has to go all the way to the beginning of the text (one character at a time, remember) before it gives up. Compare that to the lookahead expression, which gets applied exactly once.

This is a gross oversimplification, of course, and not all flavors work that way, but you get the idea. The way lookbehinds are applied is fundamentally different from (and much, much less efficient than) the way lookaheads are applied. It only makes sense to put a limit on how far back the lookbehind has to look.
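To make the difference in cost concrete, here is a toy model of the two strategies in Python. It is a sketch of the simplified description above, not of how any real engine is implemented; the function names and the `max_width` parameter are hypothetical. Each function returns whether the assertion holds and how many match attempts it took.

```python
import re


def lookahead(pattern, text, pos):
    """Apply the sub-pattern once, anchored at pos, without consuming anything."""
    matched = re.compile(pattern).match(text, pos) is not None
    return matched, 1


def naive_lookbehind(pattern, text, pos, max_width=None):
    """Step backward from pos, trying each start position in turn.

    The sub-pattern has to match exactly the slice text[start:pos].
    If max_width is given, the search never goes further back than that.
    """
    compiled = re.compile(pattern)
    lowest = 0 if max_width is None else max(0, pos - max_width)
    attempts = 0
    for start in range(pos, lowest - 1, -1):
        attempts += 1
        if compiled.fullmatch(text, start, pos):
            return True, attempts
    return False, attempts


prefix = r"""href=["']?"""
text = "x" * 1000

print(lookahead(prefix, text, 0))                          # (False, 1)
print(naive_lookbehind(prefix, text, 1000))                # (False, 1001)
print(naive_lookbehind(prefix, text, 1000, max_width=6))   # (False, 7)
```

With no limit, a failing lookbehind at position 1000 tries every start position back to the beginning of the text. Once the maximum width is known (6 characters for this prefix), the same check is bounded by a small constant, which is the kind of limit the answer argues for.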

Alan Moore answered Sep 24 '22 00:09