
Need to find a regular expression for any word except word1 or word2

Tags:

regex

Basically I need a regex which will return true if the string is a word (\w+) EXCEPT if it is the word word1 OR word2.

I've tried many things but don't think I'm even close. Help!

EdanB asked Feb 17 '11 14:02


People also ask

How do you match everything except a word in regex?

How do you ignore something in regex? To match any character except a list of excluded characters, put the excluded characters between [^ and ]. The caret ^ must immediately follow the [ or else it stands for just itself.

How do I find a word in a regular expression?

To run a “whole words only” search using a regular expression, simply place the word between two word boundaries, as we did with ‹ \bcat\b ›. The first ‹ \b › requires the ‹ c › to occur at the very start of the string, or after a nonword character.

How do you search for multiple words in a regular expression?

However, to recognize multiple words in any order using regex, I'd suggest the use of a quantifier: (\b(james|jack)\b.*){2,}. Unlike lookarounds or mode modifiers, this works in most regex flavours.

What is ?! in regex?

(?!n) is a negative lookahead, not a quantifier. It succeeds at a position only if the pattern n cannot be matched starting there, and it consumes no characters.


3 Answers

^(?!(?:word1|word2)$)\w+$

should do what you need.

(?!...) is a negative lookahead assertion that ensures that it is not possible to match the enclosed expression at the current position.
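
As a quick check of the pattern above, here is a minimal Python sketch. The helper name is_allowed_word and the sample inputs are illustrative assumptions, not part of the original answer.

```python
import re

# Pattern from the answer: the whole string must be one run of word
# characters, unless that whole string is exactly "word1" or "word2".
PATTERN = re.compile(r"^(?!(?:word1|word2)$)\w+$")

def is_allowed_word(s: str) -> bool:
    """Return True if s is a single word other than word1/word2."""
    return PATTERN.match(s) is not None

# Hypothetical sample inputs, chosen to exercise the edge cases.
for candidate in ["hello", "word1", "word2", "word10", "two words", ""]:
    print(repr(candidate), is_allowed_word(candidate))
```

Because the alternation inside the lookahead is followed by $, only the exact words are rejected; a longer word such as word10 still passes. In Python, $ also matches just before a trailing newline, so \Z is the stricter choice if that matters for your input.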

Tim Pietzcker answered Oct 31 '22 17:10


There it is:

^(?!word1|word2)\w*
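
One thing worth checking with this shorter pattern: the lookahead has no end anchor, so it appears to reject any string that merely starts with word1 or word2, and \w* without a trailing $ also accepts the empty string and strings that only begin with a word. A small Python sketch comparing it with the end-anchored form ^(?!(?:word1|word2)$)\w+$ (the sample strings and the accepts helper are my own, for illustration only):

```python
import re

# Pattern from this answer: lookahead without an end anchor.
prefix_only = re.compile(r"^(?!word1|word2)\w*")
# End-anchored form, for comparison.
anchored = re.compile(r"^(?!(?:word1|word2)$)\w+$")

def accepts(pattern, s):
    """True if the pattern matches at the start of s."""
    return pattern.match(s) is not None

# Hypothetical sample inputs.
samples = ["hello", "word1", "word10", "", "two words"]
results = {s: (accepts(prefix_only, s), accepts(anchored, s)) for s in samples}

for s, (a, b) in results.items():
    print(f"{s!r:12} prefix_only={a!s:5} anchored={b}")
```

The two patterns agree on hello and word1 but differ on word10, the empty string, and "two words", so which one is right depends on whether those cases should count as a word.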
pcofre answered Oct 31 '22 16:10


To match any word (a run of one or more letters, digits, or underscores, since you mention \w+) except word1 and word2, you can use a negative lookahead with word boundaries (\b):

\b(?!(?:word1|word2)\b)\w+

See the regex demo. Note that in PostgreSQL regex, \b must be replaced with \y.
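
To illustrate how the word boundaries behave when extracting words from longer text, here is a minimal Python sketch. The function name words_except and the sample sentence are made up for illustration.

```python
import re

# Leading \b: start only at the beginning of a word.
# (?!(?:word1|word2)\b): skip the word if it is exactly word1 or word2.
WORDS_EXCEPT = re.compile(r"\b(?!(?:word1|word2)\b)\w+")

def words_except(text):
    """Return every word in text except the exact words word1 and word2."""
    return WORDS_EXCEPT.findall(text)

# Hypothetical sample text.
text = "word1 and word10, then word2; finally word2x."
print(words_except(text))
# → ['and', 'word10', 'then', 'finally', 'word2x']
```

The \b inside the lookahead is what lets word10 and word2x through: the excluded alternatives only count when they are followed by a word boundary. The leading \b prevents the engine from restarting inside a rejected word and returning a fragment such as ord1.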

Here are some quick code snippets:

  • scala - """\b(?!(?:word1|word2)\b)\w+""".r.findAllIn(text).toList (see demo)
  • groovy - text.findAll(/\b(?!(?:word1|word2)\b)\w+/) (see demo)
  • kotlin - Regex("""\b(?!(?:word1|word2)\b)\w+""").findAll(text).map{it.value}.toList() (see demo)
  • powershell - select-string -Path $input_path -Pattern '\b(?!(?:word1|word2)\b)\w+' -AllMatches | % { $_.Matches } | % { $_.Value } > $output_file
  • c++ - std::regex rx(R"(\b(?!(?:word1|word2)\b)\w+)"); std::string s = "Extract all words but word1 and word2."; std::vector<std::string> results(std::sregex_token_iterator(s.begin(), s.end(), rx), std::sregex_token_iterator()); (see demo)
  • vb.net - Dim matches() As String = Regex.Matches(text, "\b(?!(?:word1|word2)\b)\w+").Cast(Of Match)().Select(Function(m) m.Value).ToArray()
  • swift -
      import Foundation

      extension String {
          // Return every substring of self that matches the given pattern.
          func matches(regex: String) -> [String] {
              do {
                  let regex = try NSRegularExpression(pattern: regex, options: [])
                  let nsString = self as NSString
                  let results = regex.matches(in: self, options: [], range: NSRange(location: 0, length: nsString.length))
                  return results.map { nsString.substring(with: $0.range) }
              } catch let error {
                  print("invalid regex: \(error.localizedDescription)")
                  return []
              }
          }
      }
      print("Extract all words but word1 and word2.".matches(regex: #"\b(?!(?:word1|word2)\b)\w+"#))
  • javascript - text.match(/\b(?!(?:word1|word2)\b)\w+/g) (see demo)
  • r - regmatches(text, gregexpr("(*UCP)\\b(?!(?:word1|word2)\\b)\\w+", text, perl=TRUE)) (see demo) or stringr::str_extract_all(text, "\\b(?!(?:word1|word2)\\b)\\w+") (see demo)
  • ruby - text.scan(/\b(?!(?:word1|word2)\b)\w+/) (see demo)
  • java - Pattern p = Pattern.compile("(?U)\\b(?!(?:word1|word2)\\b)\\w+"); Matcher m = p.matcher(text); List<String> res = new ArrayList<>(); while(m.find()) { res.add(m.group()); } (see demo)
  • php - if (preg_match_all('~\b(?!(?:word1|word2)\b)\w+~u', $text, $matches)) { print_r($matches[0]); } (see demo)
  • python - re.findall(r"\b(?!(?:word1|word2)\b)\w+", text) (see demo)
  • c# - Regex.Matches(text, @"\b(?!(?:word1|word2)\b)\w+").Cast<Match>().Select(x=>x.Value) (see demo)
  • grep / bash - grep -oP '\b(?!(?:word1|word2)\b)\w+' file (demo)
  • postgresql - REGEXP_MATCHES(col, '\y(?!(?:word1|word2)\y)\w+', 'g') (demo)
  • perl - @list = ($str =~ m/\b(?!(?:word1|word2)\b)(\w+)/g); (demo)
Wiktor Stribiżew answered Oct 31 '22 17:10