I have a regular expression that I'm using to find all the words in a given block of content, case-insensitively, that are contained in a glossary stored in a database. Here's my pattern:
/($word)/i
The problem is, if I use /(Foo)/i then words like Food get matched. There needs to be whitespace or a word boundary on both sides of the word.
How can I modify my expression to match only the word Foo when it is a word at the beginning, middle, or end of a sentence?
To run a “whole words only” search using a regular expression, simply place the word between two word boundaries, as we did with ‹ \bcat\b ›. The first ‹ \b › requires the ‹ c › to occur at the very start of the string, or after a nonword character. The second ‹ \b › requires the ‹ t › to occur at the very end of the string, or before a nonword character.
If we want to improve the first example to match whole words only, we would need to use \b(cat|dog)\b. This tells the regex engine to find a word boundary, then either cat or dog, and then another word boundary.
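As a rough illustration of the difference the two boundaries make, here is a minimal Python sketch. The sample sentence and the variable names are made up for this example; only the two patterns come from the text above.

```python
import re

# Hypothetical sample text, chosen only to contain "cat" and "dog"
# both as whole words and as parts of longer words.
text = "My cat chased a dog past the catalog and the hotdog stand."

# Without boundaries, the alternation also matches inside longer words.
loose = re.findall(r"(cat|dog)", text)

# With \b on both sides, only the standalone words match.
whole = re.findall(r"\b(cat|dog)\b", text)

print(loose)
print(whole)
```

The unbounded pattern also picks up the "cat" in "catalog" and the "dog" in "hotdog", while the bounded one returns only the two standalone words.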
Choose View->Show Search Options to show the Search Options pane. Expand the Search Mode drop-down and choose Regular Expressions or MS Word Wildcards. You will notice that an icon will appear next to the Source Term and Target Term fields to indicate that you are in the selected mode.
To match whole exact words, use the word boundary metacharacter '\b'. This metacharacter matches at the beginning and end of each word, but it doesn't consume anything. In other words, it simply checks whether a word starts or ends at this position (by checking that a word character sits on one side and a non-word character, or the start or end of the string, on the other).
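To see that '\b' really consumes nothing, it can help to look at the span of each match. The following is a small Python sketch, with a made-up sample string, using the case-insensitive Foo pattern from the question:

```python
import re

# Hypothetical sample text for illustration.
text = "Foo food FOO, foo."

# \b on each side, case-insensitive, as in the question's /i flag.
pattern = re.compile(r"\b(foo)\b", re.IGNORECASE)

# Record each matched word together with its (start, end) offsets.
matches = [(m.group(1), m.span()) for m in pattern.finditer(text)]

print(matches)
```

Every span is exactly three characters long: the spaces, the comma, and the full stop next to each match are checked but not included in it, and "food" is skipped because there is no boundary between the second "o" and the "d".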
Use word boundaries:
/\b($word)\b/i
Or, if you're searching for a term like "S.P.E.C.T.R.E." (as in Sinan Ünür's example), where \b fails because the term ends in a non-word character:
/(?:\W|^)(\Q$word\E)(?:\W|$)/i
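For comparison, here is how the same idea might look in Python. This is only a sketch under a few assumptions: Python has no \Q...\E, so `re.escape` stands in for it, and instead of the consuming groups (?:\W|^) and (?:\W|$) it uses the lookarounds (?<!\w) and (?!\w), a swapped-in variant that leaves the neighbouring characters out of the match. The helper name `whole_term_pattern` and the sample text are invented for the example.

```python
import re

def whole_term_pattern(term):
    """Build a case-insensitive pattern that matches `term` only when it
    is not directly preceded or followed by a word character.
    (Hypothetical helper, not part of any library.)"""
    # re.escape plays the role of \Q...\E: the dots are matched literally.
    return re.compile(r"(?<!\w)(" + re.escape(term) + r")(?!\w)", re.IGNORECASE)

# Hypothetical sample text for illustration.
text = "Agents of S.P.E.C.T.R.E. met Foo, not Food."

spectre = whole_term_pattern("S.P.E.C.T.R.E.").findall(text)
foo = whole_term_pattern("Foo").findall(text)

# Plain \b fails here: there is no word boundary between the final "."
# and the space that follows it.
with_b = re.findall(r"\b" + re.escape("S.P.E.C.T.R.E.") + r"\b", text)

print(spectre, foo, with_b)
```

Because the lookarounds consume nothing, two glossary terms separated by a single space can both match, which the consuming version would not allow.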