 

Variable-length lookbehind-assertion alternatives for regular expressions

Is there an implementation of regular expressions in Python/PHP/JavaScript that supports variable-length lookbehind assertions?

/(?<!foo.*)bar/ 

How can I write a regular expression that has the same meaning, but uses no lookbehind-assertion?

Is there a chance that this type of assertion will be implemented some day?

Things are much better than I thought.

Update:

(1) There are regular expression implementations that already support variable-length lookbehind assertions.

The Python module regex (a separate package, not the standard re module) supports such assertions (and has many other cool features).

>>> import regex
>>> m = regex.search('(?<!foo.*)bar', 'f00bar')
>>> print(m.group())
bar
>>> m = regex.search('(?<!foo.*)bar', 'foobar')
>>> print(m)
None

It was a really big surprise for me that there is something in regular expressions that Perl can't do and Python can. Perhaps there is an "enhanced regular expression" implementation for Perl as well?

(Thanks and +1 to MRAB).

(2) There is a cool feature \K in modern regular expressions.

This escape means that when you make a substitution (which, from my point of view, is the most interesting use case for assertions), everything matched before \K is left unchanged.

s/unchanged-part\Kchanged-part/new-part/x 

That is almost like a lookbehind assertion, though of course not as flexible.
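Python's standard re module has no \K, but the same "keep the prefix, replace the rest" effect can be sketched with a capturing group that is written back in the replacement. The helper name replace_after_prefix and the sample strings are made up for illustration; prefix and old are treated as regex fragments.

```python
import re

def replace_after_prefix(text, prefix, old, new):
    """Replace `old` with `new` only where `old` directly follows `prefix`.

    Emulates s/prefix\\Kold/new/ : the prefix is captured in group 1 and
    re-emitted unchanged, so only the `old` part is effectively replaced.
    """
    pattern = '(' + prefix + ')' + old
    # A function is used as the replacement so that `new` is inserted
    # literally, without backslash-escape processing.
    return re.sub(pattern, lambda m: m.group(1) + new, text)

print(replace_after_prefix('price: 10 USD', r'price: \d+ ', 'USD', 'EUR'))
# price: 10 EUR
```

Unlike \K, this approach needs the replacement to refer to the kept part explicitly, but it works in engines that lack both \K and variable-length lookbehind.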

More about \K:

  • Perl Regular Expression \K Trick
  • PCRE Regex Spotlight: \K

As far as I understand, you can't use \K twice in the same regular expression, and you can't choose how far back the matched characters are "killed": it is always back to the start of the match.

(Thanks and +1 to ikegami).

My additional questions:

  • Is it possible to specify the point where the effect of \K must stop?
  • What about enhanced regular expression implementations for Perl/Ruby/JavaScript/PHP? Something like regex for Python.
Igor Chubin asked Jul 24 '12 22:07


2 Answers

Most of the time, you can avoid variable length lookbehinds by using \K.

s/(?<=foo.*)bar/moo/s; 

would be

s/foo.*\Kbar/moo/s; 

Anything up to the last \K encountered is not considered part of the match (e.g. for the purposes of replacement, $&, etc.).
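For engines without \K, such as Python's standard re module, a rough equivalent of the substitution above is to capture the "foo.*" part and put it back in the replacement. This is only a sketch: the function name is hypothetical, and count=1 is used to mirror the Perl substitution without /g.

```python
import re

def replace_bar_after_foo(text):
    """Replace a 'bar' that has 'foo' somewhere before it with 'moo'."""
    # (foo.*) plays the role of everything before \K: it is matched,
    # then written back unchanged via \1. re.DOTALL corresponds to /s.
    return re.sub(r'(foo.*)bar', r'\1moo', text, count=1, flags=re.DOTALL)

print(replace_bar_after_foo('foo bar'))      # foo moo
print(replace_bar_after_foo('bar foo'))      # bar foo
print(replace_bar_after_foo('foo bar bar'))  # foo bar moo
```

Because .* is greedy, the last "bar" after "foo" is the one replaced, which is one way this rewrite can differ from a true lookbehind.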

Negative lookbehinds are a little trickier.

s/(?<!foo.*)bar/moo/s; 

would be

s/^(?:(?!foo).)*\Kbar/moo/s; 

because (?:(?!STRING).)* is to STRING as [^CHAR]* is to CHAR.
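The analogy can be tried out in Python's standard re module, which has neither \K nor variable-length lookbehind. In this sketch the helper names not_containing and replace_bar_not_after_foo are invented for illustration, and a capturing group stands in for \K.

```python
import re

def not_containing(literal):
    """Regex fragment matching a run of characters in which `literal` never starts.

    This is the (?:(?!STRING).)* construct: before consuming each character,
    a negative lookahead checks that STRING does not begin at that position.
    """
    return r'(?:(?!' + re.escape(literal) + r').)*'

def replace_bar_not_after_foo(text):
    """Replace the first 'bar' that has no 'foo' anywhere before it."""
    pattern = '^(' + not_containing('foo') + ')bar'
    # Group 1 is the foo-free prefix; it is written back unchanged.
    return re.sub(pattern, r'\1moo', text, count=1, flags=re.DOTALL)

print(replace_bar_not_after_foo('f00bar'))       # f00moo
print(replace_bar_not_after_foo('foobar'))       # foobar
print(replace_bar_not_after_foo('bar foo bar'))  # moo foo bar
```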


If you're just matching, you might not even need the \K.

/foo.*bar/s
/^(?:(?!foo).)*bar/s
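The two match-only patterns work as-is in Python's standard re module, with re.DOTALL in place of /s. The constant and function names below are made up for this sketch.

```python
import re

# 'bar' preceded somewhere by 'foo' (positive case).
BAR_AFTER_FOO = re.compile(r'foo.*bar', re.DOTALL)
# 'bar' with no 'foo' anywhere before it (negative case).
BAR_WITHOUT_EARLIER_FOO = re.compile(r'^(?:(?!foo).)*bar', re.DOTALL)

def has_bar_after_foo(text):
    return BAR_AFTER_FOO.search(text) is not None

def has_bar_without_earlier_foo(text):
    return BAR_WITHOUT_EARLIER_FOO.search(text) is not None

for sample in ('foobar', 'f00bar', 'bar foo bar'):
    print(sample, has_bar_after_foo(sample), has_bar_without_earlier_foo(sample))
# foobar True False
# f00bar False True
# bar foo bar True True
```

Note that the match itself now includes the prefix, so this form only suits yes/no tests, not extracting or replacing just "bar".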
ikegami answered Oct 05 '22 13:10


For Python there's a regex implementation which supports variable-length lookbehinds:

http://pypi.python.org/pypi/regex

It's designed to be backwards-compatible with the standard re module.
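For contrast, the standard re module rejects a variable-length lookbehind when the pattern is compiled. The sketch below only checks that behaviour, so it runs without the third-party package; the helper name compiles_with_re is hypothetical.

```python
import re

def compiles_with_re(pattern):
    """Return True if the standard `re` module accepts `pattern`."""
    try:
        re.compile(pattern)
    except re.error:
        # Raised for lookbehinds whose width is not fixed, among other errors.
        return False
    return True

print(compiles_with_re(r'(?<!foo.*)bar'))  # False
print(compiles_with_re(r'(?<!foo)bar'))    # True

# With the regex package installed (an assumption, not executed here),
# the first pattern can be passed to regex.compile or regex.search instead.
```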

MRAB answered Oct 05 '22 13:10