
Regex - Find all matching words that don't begin with a specific prefix

Tags:

regex

How would I construct a regular expression to find all words that end in a string but don't begin with a string?

e.g. Find all words that end in 'friend' that don't start with the word 'girl' in the following sentence:

"A boyfriend and girlfriend gained a friend when they asked to befriend them"

The words 'boyfriend', 'friend' and 'befriend' should match. The word 'girlfriend' should not.

Luke Baulch asked Jun 10 '11 15:06

2 Answers

Off the top of my head, you could try:

\b             # word boundary - matches start of word
(?!girl)       # negative lookahead for literal 'girl'
\w*            # zero or more letters, numbers, or underscores
friend         # literal 'friend'
\b             # word boundary - matches end of word
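As a quick check, the commented pattern above can be collapsed onto one line and run against the sentence from the question. This is only a minimal sketch in JavaScript; the variable names are made up for illustration, and the match is case-sensitive, so a capitalised 'Girlfriend' would not be excluded.

```javascript
// Example sentence taken from the question.
const sentence =
  'A boyfriend and girlfriend gained a friend when they asked to befriend them';

// \b        word boundary (start of word)
// (?!girl)  negative lookahead: the word must not start with 'girl'
// \w*       zero or more word characters
// friend    literal 'friend'
// \b        word boundary (end of word)
const lookaheadPattern = /\b(?!girl)\w*friend\b/g;

// With the g flag, match() returns every match, or null when there are none.
const lookaheadMatches = sentence.match(lookaheadPattern) ?? [];

console.log(lookaheadMatches);
```

For the example sentence this prints [ 'boyfriend', 'friend', 'befriend' ].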

Update

Here's another non-obvious approach which should work in any modern implementation of regular expressions:

Assuming you wish to extract a pattern that appears in multiple contexts but only want to match it in one specific context, you can use an alternation in which you first specify what you don't want and then capture what you do.

So, using your example, to extract all of the words that either are or end in friend except girlfriend, you'd use:

\b               # word boundary
(?:              # start of non-capture group 
  girlfriend     # literal (note 1)
|                # alternation
  (              # start of capture group #1 (note 2)
    \w*          # zero or more word chars [a-zA-Z0-9_]
    friend       # literal 
  )              # end of capture group #1
)                # end of non-capture group
\b

Notes:

  1. This is what we do not wish to capture.
  2. And this is what we do wish to capture.

Which can be described as:

  • for all words
  • first, match 'girlfriend' and do not capture (discard)
  • then match any word that is or ends in 'friend' and capture it

In JavaScript:

const target = 'A boyfriend and girlfriend gained a friend when they asked to befriend them';

const pattern = /\b(?:girlfriend|(\w*friend))\b/g;

let result = [];
let arr;

while((arr=pattern.exec(target)) !== null){
  if(arr[1]) {
    result.push(arr[1]);
  }
}

console.log(result);

which, when run, will print:

[ 'boyfriend', 'friend', 'befriend' ]
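
The same discard-then-capture idea can probably be written more compactly with String.prototype.matchAll, assuming an ES2020-capable runtime. This is just a sketch with illustrative variable names: a match on the unwanted 'girlfriend' branch leaves capture group 1 undefined, so those matches are filtered out.

```javascript
const text =
  'A boyfriend and girlfriend gained a friend when they asked to befriend them';

// Same pattern as above: 'girlfriend' is matched but not captured,
// any other word ending in 'friend' lands in capture group 1.
const discardPattern = /\b(?:girlfriend|(\w*friend))\b/g;

// matchAll() yields one match object per hit; m[1] is undefined
// whenever the 'girlfriend' branch of the alternation matched.
const captured = [...text.matchAll(discardPattern)]
  .map((m) => m[1])
  .filter((word) => word !== undefined);

console.log(captured);
```

Note that, unlike a lookahead on 'girl', this variant only discards the exact word 'girlfriend'.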
Rob Raisch answered Oct 25 '22 21:10



This may work:

\w*(?<!girl)friend

You could also try

\w*(?<!girl)friend\w*

if you wanted to match words like befriended or boyfriends.

I'm not sure whether the negative lookbehind (?<!...) is available in all regex flavors, but this expression worked in Expresso (which I believe uses the .NET engine).
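
For what it's worth, JavaScript gained lookbehind support in ES2018, so both expressions can be tried there as well. The sketch below assumes a runtime with that support, and the variable names are illustrative. One caveat: the lookbehind tests the characters immediately before 'friend', not the start of the word, so a word such as 'mygirlfriend' is rejected too even though it does not begin with 'girl'.

```javascript
const sample =
  'A boyfriend and girlfriend gained a friend when they asked to befriend them';

// (?<!girl) is a negative lookbehind: 'friend' must not be
// immediately preceded by 'girl'.
const lookbehindPattern = /\w*(?<!girl)friend/g;
const lookbehindMatches = sample.match(lookbehindPattern) ?? [];
console.log(lookbehindMatches);

// Variant with a trailing \w* so that suffixed words are matched as well.
const suffixPattern = /\w*(?<!girl)friend\w*/g;
const suffixMatches =
  'befriended boyfriends girlfriends'.match(suffixPattern) ?? [];
console.log(suffixMatches);
```

The first call prints [ 'boyfriend', 'friend', 'befriend' ] and the second prints [ 'befriended', 'boyfriends' ].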

FrustratedWithFormsDesigner answered Oct 25 '22 22:10
