
How can I find everything BUT certain phrases with a regular expression?

Ok, so I have a phrase "foo bar" and I want to find everything BUT "foo bar".
Here's my text.

ipsum dolor foo bar Lorem ipsum dolor sit amet,
consectetur adipisicing elit, sed do
eiusmod tempor foo bar incididunt ut labore et
dolore foo bar

There's a way to do this purely within a regex, right? I don't have to fall back on string functions, do I?

RESULT:

NOTE: I can't do proper highlighting, but the bold gives you an idea (the spaces before and after each phrase would also be selected, but including them breaks the bolding).

ipsum dolor foo bar Lorem ipsum dolor sit amet,
consectetur adipisicing elit, sed do
eiusmod tempor foo bar incididunt ut labore et
dolore foo bar

Assume PCRE nomenclature.


UPDATE 7/29/2013: It may be better to use a search-and-replace function in your language of choice to simply remove the phrases you don't want, so that you are left with the text you do want.

Keng asked Dec 31 '25 09:12


1 Answer

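In general, if foobar matches itself, then (?s:(?!foobar).)* matches any run of characters that does not contain foobar, including nothing at all. The (?s:...) wrapper lets the dot match newlines too, so the run can span several lines.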
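As a rough illustration of that pattern, here is a small Python sketch. Python's re module is not PCRE, but it accepts the same lookahead and scoped-flag syntax; the helper name lacks_foobar and the sample strings are made up for this example. Anchoring the pattern to the whole string turns it into a test for "this text never contains foobar".

```python
import re

# (?s:...) enables "dot matches newline" only inside the group.
# Each step consumes one character, but only where foobar does not start.
NO_FOOBAR = re.compile(r"(?s:(?!foobar).)*")

def lacks_foobar(text: str) -> bool:
    """Hypothetical helper: True when the whole string contains no foobar."""
    return NO_FOOBAR.fullmatch(text) is not None

clean = lacks_foobar("spam\neggs")         # spans a newline, no foobar
dirty = lacks_foobar("spam foobar eggs")   # foobar blocks the match
empty = lacks_foobar("")                   # "nothing at all" still matches
```

Note that without the anchoring done by fullmatch, the pattern would happily match the part of a string that precedes foobar, or an empty string right at it.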

You could use that to find lines that don’t have foobar in them, for example, using

^(?:(?!foobar).)*$
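A minimal sketch of that line filter in Python, using the question's sample text and its phrase "foo bar" in place of foobar. The re.MULTILINE flag is an assumption added here so that ^ and $ anchor at every line rather than only at the ends of the string.

```python
import re

TEXT = (
    "ipsum dolor foo bar Lorem ipsum dolor sit amet,\n"
    "consectetur adipisicing elit, sed do\n"
    "eiusmod tempor foo bar incididunt ut labore et\n"
    "dolore foo bar"
)

# A line matches only if no position in it starts the phrase "foo bar".
# The dot does not match newlines here, so each match stays on one line.
LINE_WITHOUT_PHRASE = re.compile(r"^(?:(?!foo bar).)*$", re.MULTILINE)

lines_without_phrase = LINE_WITHOUT_PHRASE.findall(TEXT)
print(lines_without_phrase)
```

With this sample only the second line is free of the phrase, so it is the only line returned.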

You could also use your language’s split() function to split on foobar, which will give you all the pieces that do not include the split pattern.
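The split approach might look like the following in Python, again with the question's text and phrase. Dropping empty pieces is a choice made for this sketch: the sample ends with the phrase, so the split leaves a trailing empty string.

```python
import re

TEXT = (
    "ipsum dolor foo bar Lorem ipsum dolor sit amet,\n"
    "consectetur adipisicing elit, sed do\n"
    "eiusmod tempor foo bar incididunt ut labore et\n"
    "dolore foo bar"
)
PHRASE = "foo bar"

# re.escape keeps the phrase literal even if it held regex metacharacters.
# Filtering on truthiness discards the empty piece after the final phrase.
pieces = [piece for piece in re.split(re.escape(PHRASE), TEXT) if piece]

for piece in pieces:
    print(repr(piece))
```

Each piece keeps its surrounding spaces and newlines, which matches the question's remark that the spaces before and after the phrase belong to the selection.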

Regarding the nasty little-known backtracking control verbs like (*FAIL) and (*COMMIT), I haven’t yet had much occasion to use them in ‘non-toy’ programs. I find that independent subexpressions via (?>...) and the possessive quantifiers *+, ++, ?+ etc. give me more than enough rope, so to speak.

That said, I do have one toy example of using (*FAIL) in this answer; it’s the very first regex solution. The reason for its being there was I wanted to force the regex engine to backtrack through all possible permutations; the real goal was merely to count how many ways it tried things.

Please understand that my two regexes there, along with the many, many incredibly creative answers from others, are all meant to be fun, tongue-in-cheek things. Still, one can learn a lot from them — once one recovers from shock. ☺

tchrist answered Jan 02 '26 01:01