
Regex negative lookbehinds with a wildcard

I'm trying to match some text if it does not have another block of text in its vicinity. For example, I would like to match "bar" if "foo" does not precede it. I can match "bar" if "foo" does not immediately precede it using negative look behind in this regex:

/(?<!foo)bar/

but I would also like to avoid matching "foo 12345 bar". I tried:

/(?<!foo.{1,10})bar/

but combining a wildcard with a range quantifier inside the lookbehind appears to be an invalid regex in Ruby. Am I thinking about the problem the wrong way?

Kevin Eder asked Nov 30 '12 19:11


1 Answer

You are thinking about it the right way. Unfortunately, lookbehinds usually have to be of fixed length. The only major exception is .NET's regex engine, which allows repetition quantifiers inside lookbehinds. But since you only need a negative lookbehind and not a lookahead as well, there is a hack for you: reverse the string, then try to match:

/rab(?!.{0,10}oof)/

Then reverse the matched text, or, if the position is what you are after, subtract the match's end offset in the reversed string from the string's length to get the start offset in the original string.
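As a rough sketch of the reversal trick in Ruby, the following maps each match in the reversed string back to a start offset in the original. The method name bar_positions_without_foo is made up for illustration, and the literal "bar" and "foo" are hard-coded on the assumption that the simplified pattern from the question is what you need:

```ruby
# Hypothetical helper (name is illustrative): returns the start offsets of
# every "bar" that has no "foo" ending within the 0..10 characters before it.
def bar_positions_without_foo(text)
  reversed = text.reverse
  positions = []
  # In the reversed string the lookbehind becomes a lookahead, and a
  # lookahead may contain a variable-length quantifier.
  reversed.scan(/rab(?!.{0,10}oof)/) do
    rev_start = Regexp.last_match.begin(0)
    # The match covers rev_start...(rev_start + 3) in the reversed string,
    # so its start in the original is length - (rev_start + 3).
    positions << text.length - rev_start - "bar".length
  end
  positions.sort
end

bar_positions_without_foo("foo 12345 bar")  # no offsets: "foo" is too close
bar_positions_without_foo("xx bar foo bar") # only the first "bar" qualifies
```

Note that . does not match a newline by default, so a "foo" on an earlier line would not block the match unless the /m flag is added.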

Now from the regex you have given, I suppose that this was only a simplified version of what you actually need. Of course, if bar is a complex pattern itself, some more thought needs to go into how to reverse it correctly.

Note that if your pattern required both variable-length lookbehinds and lookaheads, you would have a harder time solving this. Also, in your case, it would be possible to split your lookbehind into multiple fixed-length ones, one per possible gap length (because you use neither + nor *):

/(?<!foo)(?<!foo.)(?<!foo.{2})(?<!foo.{3})(?<!foo.{4})(?<!foo.{5})(?<!foo.{6})(?<!foo.{7})(?<!foo.{8})(?<!foo.{9})(?<!foo.{10})bar/

But that's not all that nice, is it?
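If you do go the chained-lookbehind route, the pattern need not be typed out by hand. Here is a minimal sketch that builds it, assuming the blocker and the target are plain literal strings; the helper name not_preceded_within and its parameters are invented for this example:

```ruby
# Hypothetical helper (name and parameters are illustrative): builds one
# fixed-length negative lookbehind per gap size in 0..max_gap and chains
# them in front of the target.
def not_preceded_within(blocker, target, max_gap)
  lookbehinds = (0..max_gap).map do |gap|
    # A gap of 0 is just the bare blocker, as in (?<!foo).
    filler = gap.zero? ? "" : ".{#{gap}}"
    "(?<!#{Regexp.escape(blocker)}#{filler})"
  end
  Regexp.new(lookbehinds.join + Regexp.escape(target))
end

pattern = not_preceded_within("foo", "bar", 10)
pattern.match?("foo 12345 bar") # "foo" is 7 characters back, so no match
pattern.match?("12345 bar")     # no "foo" nearby, so this matches
```

Every lookbehind here has a fixed length, which is what keeps the generated pattern acceptable to Ruby's engine.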

Martin Ender answered Nov 05 '22 18:11