I am trying to check whether a string contains a word as a whole, using Java. Below are some examples:
Text : "A quick brown fox"
Words:
"qui" - false
"quick" - true
"quick brown" - true
"ox" - false
"A" - true
Below is my code:
String pattern = "\\b(<word>)\\b";
String s = "ox";
String text = "A quick brown fox".toLowerCase();
System.out.println(Pattern.compile(pattern.replaceAll("<word>", s.toLowerCase())).matcher(text).find());
It works fine with strings like the ones in the example above. However, I get incorrect results if the input string has characters like %, ( etc., e.g.:
Text : "c14, 50%; something (in) bracket"
Words:
"c14, 50%;" : false
"(in) bracket" : false
It has something to do with my regex pattern (or maybe I am doing the entire pattern matching wrongly). Could anyone suggest a better approach?
It appears you only want to match "words" enclosed with whitespace (or at the start/end of strings).
Use
String pattern = "(?<!\\S)" + Pattern.quote(word) + "(?!\\S)";
The (?<!\S) negative lookbehind fails any match that is immediately preceded by a non-whitespace character, and the (?!\S) negative lookahead fails any match that is immediately followed by a non-whitespace character. Pattern.quote() is necessary so that special characters in the word are treated as literal characters in the regex pattern.
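For illustration, here is a minimal sketch of that pattern wrapped in a helper and run against the texts from the question. The class name WholeWordMatch and the method containsWhole are hypothetical names chosen for this example. The CASE_INSENSITIVE flag is also an assumption: it stands in for the toLowerCase() calls in the question's code and can be dropped if matching should be case-sensitive.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class WholeWordMatch {

    // Returns true if `word` occurs in `text` with whitespace (or the
    // start/end of the string) on both sides.
    static boolean containsWhole(String text, String word) {
        // Pattern.quote() makes characters such as '(' and '%' literal.
        String pattern = "(?<!\\S)" + Pattern.quote(word) + "(?!\\S)";
        // Assumption: case-insensitive matching, mirroring the question's toLowerCase().
        return Pattern.compile(pattern, Pattern.CASE_INSENSITIVE).matcher(text).find();
    }

    public static void main(String[] args) {
        String text1 = "A quick brown fox";
        System.out.println(containsWhole(text1, "qui"));          // false
        System.out.println(containsWhole(text1, "quick brown"));  // true
        System.out.println(containsWhole(text1, "ox"));           // false

        String text2 = "c14, 50%; something (in) bracket";
        System.out.println(containsWhole(text2, "c14, 50%;"));    // true
        System.out.println(containsWhole(text2, "(in) bracket")); // true
        System.out.println(containsWhole(text2, "in"));           // false
    }
}
```

Note that with this pattern "in" does not match inside "(in)", because the parentheses are non-whitespace characters. If punctuation should also count as a delimiter, the lookarounds would need to be adjusted.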