
A regex to match a substring that isn't followed by a certain other substring

I need a regex that will match blahfooblah but not blahfoobarblah

I want it to match only foo and everything around foo, as long as it isn't followed by bar.

I tried foo.*(?<!bar), which is fairly close, but it still matches blahfoobarblah. The negative lookbehind needs to match anything, not just bar.

The specific language I'm using is Clojure which uses Java regexes under the hood.

EDIT: More specifically, I also need it to pass blahfooblahfoobarblah but not blahfoobarblahblah.

Rayne asked Apr 13 '10 15:04




1 Answer

Try:

/(?!.*bar)(?=.*foo)^(\w+)$/ 

Tests:

blahfooblah            # pass
blahfooblahbarfail     # fail
somethingfoo           # pass
shouldbarfooshouldfail # fail
barfoofail             # fail

Regular expression explanation

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    bar                      'bar'
--------------------------------------------------------------------------------
  )                        end of look-ahead
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    foo                      'foo'
--------------------------------------------------------------------------------
  )                        end of look-ahead
--------------------------------------------------------------------------------
  ^                        the beginning of the string
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
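
Since the question targets Clojure on top of Java regexes, here is a minimal Java sketch that runs the pattern above against the same test strings. The surrounding slashes are dropped because Java has no regex delimiters. The class name NoBarAnywhere and the helper passes are made up for illustration.

```java
import java.util.regex.Pattern;

class NoBarAnywhere {
    // Same pattern as above without the /.../ delimiters;
    // the backslash is doubled for the Java string literal.
    static final Pattern P = Pattern.compile("(?!.*bar)(?=.*foo)^(\\w+)$");

    // Hypothetical helper: true when the pattern finds a match in the input.
    static boolean passes(String s) {
        return P.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "blahfooblah",
            "blahfooblahbarfail",
            "somethingfoo",
            "shouldbarfooshouldfail",
            "barfoofail"
        };
        for (String s : inputs) {
            System.out.println(s + " -> " + (passes(s) ? "pass" : "fail"));
        }
    }
}
```

From Clojure, the same pattern should work as a regex literal passed to re-find, for example (re-find #"(?!.*bar)(?=.*foo)^(\w+)$" s).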

Other regex

If you only want to exclude bar when it is directly after foo, you can use

/(?!.*foobar)(?=.*foo)^(\w+)$/ 
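
To see how this variant differs, a small Java sketch (class and method names are hypothetical) can check a string where bar appears but not directly after foo. Note that this version rejects any string containing foobar anywhere, even if another foo in the string is not followed by bar.

```java
import java.util.regex.Pattern;

class NoFoobarAnywhere {
    // Rejects the input only when the literal "foobar" occurs somewhere in it.
    static final Pattern P = Pattern.compile("(?!.*foobar)(?=.*foo)^(\\w+)$");

    // Hypothetical helper: true when the pattern finds a match in the input.
    static boolean passes(String s) {
        return P.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "blahfooblahbarfail",    // bar present, but not right after foo
            "blahfoobarblah",        // foo directly followed by bar
            "blahfooblahfoobarblah"  // one good foo, but foobar also occurs
        };
        for (String s : inputs) {
            System.out.println(s + " -> " + (passes(s) ? "pass" : "fail"));
        }
    }
}
```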

Edit

You updated your question to make it more specific. For that case, use:

/(?=.*foo(?!bar))^(\w+)$/ 

New tests

fooshouldbarpass               # pass
butnotfoobarfail               # fail
fooshouldpassevenwithfoobar    # pass
nofuuhere                      # fail

New explanation

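(?=.*foo(?!bar)) ensures that at least one foo is found that is not directly followed by bar. A foobar elsewhere in the string does not matter, as long as some other foo passes the check.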
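
As a rough check in Java, the sketch below runs this pattern against the four strings from the question, including the two from its EDIT. The class name FooNotFollowedByBar and the helper passes are assumptions for illustration only.

```java
import java.util.regex.Pattern;

class FooNotFollowedByBar {
    // Lookahead requires some "foo" whose next three characters are not "bar".
    static final Pattern P = Pattern.compile("(?=.*foo(?!bar))^(\\w+)$");

    // Hypothetical helper: true when the pattern finds a match in the input.
    static boolean passes(String s) {
        return P.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "blahfooblah",
            "blahfoobarblah",
            "blahfooblahfoobarblah",
            "blahfoobarblahblah"
        };
        for (String s : inputs) {
            System.out.println(s + " -> " + (passes(s) ? "pass" : "fail"));
        }
    }
}
```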

maček answered Oct 13 '22 02:10