
Regex negative lookbehind on string

Tags:

regex

php

pcre

I can't find a way to prevent a match when a string exists somewhere before another string, rather than immediately before it.

I can prevent a match when a string exists immediately before another string, using the following:

$string = 'Stackoverflow hello world foobar test php';

$regex = "~(Stackoverflow).*?(?<!(test\s))(php)~i";

if(preg_match_all($regex,$string,$match))
    print_r($match);

In this example, we want to return a match if the string contains the words Stackoverflow and php, but only if the word test (followed by a whitespace character) does not appear immediately before php.

This doesn't return any result, which is good.

Let's now say I want to match php, but only if the word foobar doesn't exist anywhere between Stackoverflow and php. I assumed I could do the following:

$string = 'Stackoverflow hello world foobar test php';

$regex = "~(Stackoverflow).*?(?<!(foobar)).*?(php)~i";

if(preg_match_all($regex,$string,$match))
    print_r($match);

(I have changed the negative lookbehind string to (foobar) and added .*? after it.)

I should also say that I cannot always know what words will exist between foobar and php. Sometimes there will be none, sometimes 200. I do, however, have some positioning information: foobar comes after Stackoverflow and before php.

asked Mar 14 '14 00:03 by cecilli0n



1 Answer

I would use a negative lookahead to ensure the string 'foobar.*php' does not exist after 'stackoverflow'. Since you want to capture php, I'd put that into a capturing group. Something like:

Stackoverflow(?:(?!foobar.*php).)*(php)

Note that this results in the lookahead being checked at every character position, which can get expensive on long strings.
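As a rough sketch of how the pattern above behaves in PHP, the snippet below runs it against the question's string and against a variant without foobar. The `~` delimiters and the `i` flag are carried over from the question. The helper name `capturePhp` and the second test string are made up for illustration. The last line also runs the question's lookbehind attempt, which matches because `(?<!(foobar))` only has to succeed at a single position, and it already does right after Stackoverflow.

```php
<?php
// Pattern from the answer. The (?:(?!foobar.*php).)* part consumes one
// character at a time, and only while "foobar ... php" does not start at
// the current position.
$regex = '~Stackoverflow(?:(?!foobar.*php).)*(php)~i';

// Hypothetical helper (not from the original post): returns the captured
// "php", or null when the pattern does not match.
function capturePhp(string $regex, string $subject): ?string
{
    return preg_match($regex, $subject, $m) === 1 ? $m[1] : null;
}

// The question's string: foobar sits between Stackoverflow and php.
var_dump(capturePhp($regex, 'Stackoverflow hello world foobar test php')); // NULL

// Illustrative variant without foobar.
var_dump(capturePhp($regex, 'Stackoverflow hello world test php')); // string(3) "php"

// The question's lookbehind attempt still matches the original string.
$attempt = '~(Stackoverflow).*?(?<!(foobar)).*?(php)~i';
var_dump(preg_match($attempt, 'Stackoverflow hello world foobar test php')); // int(1)
```

If foobar only appears after php, the lookahead never finds 'foobar.*php', so the pattern should still match in that case.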

answered Oct 18 '22 00:10 by Ron Rosenfeld