I'm trying to run a regexp in PHP (preg_match_all) that matches certain whole words in a string, but the problem is that it also matches words that contain only part of a tested word.
Also, this is a sub-query in a larger regexp, so other PHP functions like strpos won't help me, sadly.
String: "I test a string"
Words to match: "testable", "string"
Tried regexp: /([testable|string]+)/
Expected result: "string" only!
Result: "test", "a", "string"
\w -- (lowercase w) matches a "word" character: a letter, digit, or underscore [a-zA-Z0-9_]. Note that although "word" is the mnemonic for this, it only matches a single word character, not a whole word. \W (uppercase W) matches any non-word character. \b -- matches the boundary between a word character and a non-word character (or the start or end of the string).
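As a quick check of the difference between \w, \w+ and \W, here is a minimal sketch that runs them against the string from the question. The variable names are only illustrative.

```php
<?php
$text = 'I test a string';

// \w matches exactly one word character, so every letter is its own match.
$singleCount = preg_match_all('/\w/', $text, $single);

// \w+ matches a run of word characters, which here means each whole word.
preg_match_all('/\w+/', $text, $runs);

// \W matches the non-word characters (here, the spaces between words).
$nonWordCount = preg_match_all('/\W/', $text, $nonWord);

echo $singleCount, "\n";   // number of individual word characters
print_r($runs[0]);         // the whole words
echo $nonWordCount, "\n";  // number of non-word characters
```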
To match whole words only, such as cat or dog, use \b(cat|dog)\b. This tells the regex engine to find a word boundary, then either cat or dog, and then another word boundary.
To run a "whole words only" search using a regular expression, simply place the word between two word boundaries, as in \bcat\b. The first \b requires the c to occur at the very start of the string or after a nonword character. The second \b requires the t to occur at the very end of the string or before a nonword character.
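The effect of the two word boundaries can be seen by counting matches with and without them. This is only a sketch; the sample sentence is made up for illustration.

```php
<?php
$text = 'cat concatenate bobcat cat.';

// Without boundaries, "cat" is also found inside "concatenate" and "bobcat".
$looseCount = preg_match_all('/cat/', $text, $loose);

// With \b on both sides, only the standalone word "cat" matches.
// The trailing "." is a non-word character, so the last "cat" still counts.
$wholeCount = preg_match_all('/\bcat\b/', $text, $whole);

echo $looseCount, "\n";
echo $wholeCount, "\n";
```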
A * (asterisk) after a token lets that token repeat zero or more times. \s (whitespace metacharacter) matches any whitespace character (space, tab, line break, ...), and \S (the opposite of \s) matches any character that is not whitespace.
If you really want to make sure you only get your words and not words that contain them, then you can use word boundary anchors:
/\b(testable|string)\b/
This will match only a word boundary followed by either testable or string and then another word boundary.
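Applied to the string from the question, the difference looks like the sketch below. Building the pattern from an array with preg_quote is an addition for illustration, not part of the original answer. It assumes the words may contain regex metacharacters and that / is the delimiter. Because the group is non-capturing, the same fragment can be dropped into a larger pattern.

```php
<?php
$subject = 'I test a string';
$words   = ['testable', 'string'];

// The original attempt: [...] is a character class, so this matches any run
// of the listed characters rather than the two words.
preg_match_all('/[testable|string]+/', $subject, $broken);

// Escape each word, join with |, and wrap the group in word boundaries.
$escaped = array_map(function ($word) {
    return preg_quote($word, '/');
}, $words);
$pattern = '/\b(?:' . implode('|', $escaped) . ')\b/';

preg_match_all($pattern, $subject, $fixed);

print_r($broken[0]); // runs of characters from the class
print_r($fixed[0]);  // whole-word matches only
```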
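You don't want a character class with []. The brackets make the pattern match any run of the listed characters rather than the words themselves. You just want to match the words: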
/testable|string/
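Note that a plain alternation still matches the words when they appear inside longer words. A small sketch with a made-up sentence shows this, next to the same alternation wrapped in word boundaries.

```php
<?php
$text = 'untestable strings and a string';

// Plain alternation: also hits "testable" in "untestable"
// and "string" in "strings".
preg_match_all('/testable|string/', $text, $plain);

// The same alternation between word boundaries: whole words only.
preg_match_all('/\b(testable|string)\b/', $text, $bounded);

print_r($plain[0]);
print_r($bounded[0]);
```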