
Regex: Match word not containing

Tags:

regex

I have the following words:

EFI Internal Shell
EFI Hard Drive
EFI Drive

I want to match the entries that contain EFI but do not contain Drive. So only the first entry (EFI Internal Shell) should match.

How can this be done in regex?

I looked through SO and none of the answers were able to get me on the right track.

For example, "Regular expression that doesn't contain certain string" says to use ^((?!my string).)*$ but that didn't work, not even to match any string not containing Drive.

Any tips?

asked Oct 15 '15 22:10 by dukevin




1 Answer

Your ^((?!Drive).)*$ did not work at all because you tested against a multiline input.

You should use the /m (multiline) modifier to see what the regex matches. With it, the pattern matches lines that do not contain Drive, but that tempered greedy token does not check whether EFI is inside the string.
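To make the effect of the multiline modifier concrete, here is a minimal sketch in Python, assuming the three entries arrive as one newline-separated string. The extra "Boot Menu" line is a hypothetical entry added only to show that the tempered greedy token never checks for EFI.

```python
import re

# The three entries from the question, plus one hypothetical line without "EFI".
text = "EFI Internal Shell\nEFI Hard Drive\nEFI Drive\nBoot Menu"

pattern = r"^((?!Drive).)*$"

# Without MULTILINE, ^ and $ anchor only at the ends of the whole input,
# and . cannot cross a newline, so nothing matches.
no_flag = re.search(pattern, text)

# With MULTILINE, ^ and $ anchor at every line, so each line is tested on its own.
per_line = [m.group(0) for m in re.finditer(pattern, text, flags=re.MULTILINE)]

print(no_flag)   # None
print(per_line)  # ['EFI Internal Shell', 'Boot Menu']
```

The second result includes "Boot Menu", which is why a separate EFI check is still needed.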

Actually, the $ anchor is redundant here since .* matches any zero or more characters other than line break characters. You may simply remove it from your pattern.

(NOTE: In .NET, you will need to use [^\r\n]* instead of .*, since . in a .NET pattern matches any char except a newline (LF) and therefore still matches the other line break chars, such as a carriage return (CR).)
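Python's re module treats . the same way (everything except LF), so the difference can be sketched there too. The CRLF-separated input below is an assumption made for illustration.

```python
import re

# Hypothetical input that uses Windows-style CRLF line endings.
line = "EFI Internal Shell\r\nEFI Drive"

# . stops at LF only, so the carriage return ends up inside the match.
with_dot = re.match(r".*", line).group(0)

# The explicit negated class stops at either line break character.
with_class = re.match(r"[^\r\n]*", line).group(0)

print(repr(with_dot))    # 'EFI Internal Shell\r'
print(repr(with_class))  # 'EFI Internal Shell'
```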

Use something like

^(?!.*Drive).*EFI.*

Or, if you need to only fail the match if a Drive is present as a whole word:

^(?!.*\bDrive\b).*EFI.*

Or, if there are more words you want to signal the failure with:

^(?!.*(?:Drive|SomethingElse)).*EFI.*
^(?!.*\b(?:Drive|SomethingElse)\b).*EFI.*

See regex demo

Here,

  • ^ - matches start of string
  • (?!.*Drive) - makes sure there is no "Drive" in the string (so, Drives are NOT allowed)
  • (?!.*\bDrive\b) - makes sure there is no "Drive" as a whole word in the string (so, Drives are allowed)
  • .* - any 0+ chars other than line break chars, as many as possible
  • EFI - an EFI substring
  • .* - any 0+ chars other than line break chars, as many as possible.
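The breakdown above can be checked with a short Python sketch that runs the plain and the whole-word pattern against the entries from the question. The "EFI Drives" entry is a hypothetical addition that shows where the two patterns differ.

```python
import re

entries = ["EFI Internal Shell", "EFI Hard Drive", "EFI Drive"]

no_drive = re.compile(r"^(?!.*Drive).*EFI.*")
no_drive_word = re.compile(r"^(?!.*\bDrive\b).*EFI.*")

def matching(pattern, items):
    """Return the items that the compiled pattern matches."""
    return [s for s in items if pattern.search(s)]

print(matching(no_drive, entries))  # ['EFI Internal Shell']

# Hypothetical entry: contains "Drive" only as part of a longer word.
extra = entries + ["EFI Drives"]
plain_result = matching(no_drive, extra)
word_result = matching(no_drive_word, extra)

print(plain_result)  # ['EFI Internal Shell']
print(word_result)   # ['EFI Internal Shell', 'EFI Drives']
```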

If your string has newlines, either use a /s dotall modifier or replace . with [\s\S].
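As a rough illustration of the newline case, the Python sketch below compares the original pattern with a dotall version and a [\s\S] version. The sample string, in which Drive appears after a line break, is an assumption made for the example; re.DOTALL is Python's equivalent of the /s modifier.

```python
import re

# Hypothetical single value in which "Drive" sits on a second line.
record = "EFI Internal\nDrive"

plain = re.compile(r"^(?!.*Drive).*EFI.*")
dotall = re.compile(r"^(?!.*Drive).*EFI.*", re.DOTALL)
portable = re.compile(r"^(?![\s\S]*Drive)[\s\S]*EFI[\s\S]*")

# The lookahead in the plain pattern cannot see past the newline,
# so it misses "Drive" and the string is wrongly accepted.
plain_hit = plain.search(record) is not None

# Both variants let the lookahead scan the whole string and reject it.
dotall_hit = dotall.search(record) is not None
portable_hit = portable.search(record) is not None

print(plain_hit, dotall_hit, portable_hit)  # True False False
```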

answered Nov 15 '22 21:11 by Wiktor Stribiżew