Regular Expressions and negating a whole character group [duplicate]

Tags:

regex

People also ask

How do I capture a group in regex?

Capturing groups are a way to treat multiple characters as a single unit. They are created by placing the characters to be grouped inside a set of parentheses. For example, the regular expression (dog) creates a single group containing the letters "d" "o" and "g" .

How do you negate a set of characters?

Negated Character Classes If you don't want a negated character class to match line breaks, you need to include the line break characters in the class. [^0-9\r\n] matches any character that is not a digit or a line break.

Using a character class such as [^ab] will match a single character that is not within the set of characters. (With the ^ being the negating part).

To match a string which does not contain the multi-character sequence ab, you want to use a negative lookahead:

^(?:(?!ab).)+$

And the above expression disected in regex comment mode is:

(?x)    # enable regex comment mode
^       # match start of line/string
(?:     # begin non-capturing group
  (?!   # begin negative lookahead
    ab  # literal text sequence ab
  )     # end negative lookahead
  .     # any single character
)       # end non-capturing group
+       # repeat previous match one or more times
$       # match end of line/string

Use negative lookahead:

^(?!.*ab).*$

UPDATE: In the comments below, I stated that this approach is slower than the one given in Peter's answer. I've run some tests since then, and found that it's really slightly faster. However, the reason to prefer this technique over the other is not speed, but simplicity.

The other technique, described here as a tempered greedy token, is suitable for more complex problems, like matching delimited text where the delimiters consist of multiple characters (like HTML, as Luke commented below). For the problem described in the question, it's overkill.

For anyone who's interested, I tested with a large chunk of Lorem Ipsum text, counting the number of lines that don't contain the word "quo". These are the regexes I used:

(?m)^(?!.*\bquo\b).+$

(?m)^(?:(?!\bquo\b).)+$

Whether I search for matches in the whole text, or break it up into lines and match them individually, the anchored lookahead consistently outperforms the floating one.

Yes its called negative lookahead. It goes like this - (?!regex here). So abc(?!def) will match abc not followed by def. So it'll match abce, abc, abck, etc.

Similarly there is positive lookahead - (?=regex here). So abc(?=def) will match abc followed by def.

There are also negative and positive lookbehind - (?<!regex here) and (?<=regex here) respectively

One point to note is that the negative lookahead is zero-width. That is, it does not count as having taken any space.

So it may look like a(?=b)c will match "abc" but it won't. It will match 'a', then the positive lookahead with 'b' but it won't move forward into the string. Then it will try to match the 'c' with 'b' which won't work. Similarly ^a(?=b)b$ will match 'ab' and not 'abb' because the lookarounds are zero-width (in most regex implementations).

More information on this page

abc(?!def) will match abc not followed by def. So it'll match abce, abc, abck, etc. what if I want neither def nor xyz will it be abc(?!(def)(xyz)) ???

I had the same question and found a solution:

abc(?:(?!def))(?:(?!xyz))

These non-counting groups are combined by "AND", so it this should do the trick. Hope it helps.

Related questions
                            
                                Add spaces before Capital Letters
                            
                                Fastest way to check a string contain another substring in JavaScript?
                            
                                How to filter rows in pandas by regex
                            
                                How do you implement a good profanity filter?
                            
                                RegEx for matching UK Postcodes
                            
                                Difference between \A \z and ^ $ in Ruby regular expressions
                            
                                Switch statement for string matching in JavaScript
                            
                                RegEx to extract all matches from string using RegExp.exec
                            
                                Why are regular expressions so controversial? [closed]
                            
                                Replace only some groups with Regex
                            
                                Regular expression to get a string between two strings in Javascript
                            
                                php Replacing multiple spaces with a single space [duplicate]
                            
                                Regular expression for matching HH:MM time format
                            
                                Regex that accepts only numbers (0-9) and NO characters [duplicate]
                            
                                Replacing all non-alphanumeric characters with empty strings
                            
                                Using regular expressions to parse HTML: why not?
                            
                                RE error: illegal byte sequence on Mac OS X
                            
                                Regex to validate date formats dd/mm/YYYY, dd-mm-YYYY, dd.mm.YYYY, dd mmm YYYY, dd-mmm-YYYY, dd/mmm/YYYY, dd.mmm.YYYY with Leap Year Support
                            
                                How do I negate a test with regular expressions in a bash script?
                            
                                How to check for valid email address? [duplicate]

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Regular Expressions and negating a whole character group [duplicate]

Tags:

regex

People also ask

Related questions

Recent Activity

Donate For Us