I have a rather peculiar problem. I'm trying to find a pattern like [some string][word boundary]. Simplified, my code is:
final Pattern pattern = Pattern.compile(Pattern.quote(someString) + "\\b");
final String value = someString + " ";
System.out.println(pattern.matcher(value).find());
My logic tells me this should always output true, regardless of what someString is. However:
If someString ends with a word character (e.g. "abc"), true is output.
If someString ends with a word boundary (e.g. "abc."), false is output.

Any ideas what is happening? My current workaround is to use \W instead of \b, but I'm not sure of the implications.
A dot followed by a space is not a word boundary.
A word boundary is the position between a word character and a non-word character, or vice versa,
i.e. between [a-zA-Z0-9_][^a-zA-Z0-9_]
or [^a-zA-Z0-9_][a-zA-Z0-9_]
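To make the rule concrete, here is a minimal sketch that runs the question's pattern against both inputs. The class name WordBoundaryDemo and the helper findsBoundary are made up for illustration; the pattern itself is the one from the question.

```java
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // Hypothetical helper: builds the question's pattern (quoted string + \b)
    // and searches for it in someString followed by a single space.
    static boolean findsBoundary(String someString) {
        Pattern pattern = Pattern.compile(Pattern.quote(someString) + "\\b");
        String value = someString + " ";
        return pattern.matcher(value).find();
    }

    public static void main(String[] args) {
        // 'c' followed by ' ' is word then non-word, so \b matches there.
        System.out.println(findsBoundary("abc"));  // true
        // '.' followed by ' ' is non-word then non-word, so \b cannot match.
        System.out.println(findsBoundary("abc.")); // false
    }
}
```

In the second case the quoted string itself matches, but the position right after the dot has non-word characters on both sides, so the \b assertion fails.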
A word boundary is the position between a non-word character and a word character, or vice versa (the start or end of the input next to a word character also counts). A space preceded by a period (two non-word characters) does not meet this requirement.
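The effect of using \W is that any single non-word character will be matched, without the condition that it is preceded by a word character, which seems correct for your example. Note that, unlike \b, \W consumes the character it matches and needs one to be present, so it will not match at the very end of the input.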
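As a rough illustration of the implications, the sketch below compares \W and \b, along with a negative lookahead (?!\w). The lookahead is not something suggested in the answers above; it is one possible alternative if the goal is "not followed by a word character". The class name SuffixDemo and the helper find are hypothetical.

```java
import java.util.regex.Pattern;

class SuffixDemo {
    // Hypothetical helper: quotes someString, appends the given regex suffix,
    // and reports whether the resulting pattern is found in value.
    static boolean find(String suffix, String someString, String value) {
        return Pattern.compile(Pattern.quote(someString) + suffix)
                      .matcher(value)
                      .find();
    }

    public static void main(String[] args) {
        // \W needs an actual non-word character after the string.
        System.out.println(find("\\W", "abc.", "abc. "));     // true
        System.out.println(find("\\W", "abc.", "abc."));      // false: nothing left to consume

        // \b is zero-width and also matches at the end of input after a word character.
        System.out.println(find("\\b", "abc", "abc"));        // true

        // (?!\w) is zero-width too: it only requires that no word character follows.
        System.out.println(find("(?!\\w)", "abc.", "abc. ")); // true
        System.out.println(find("(?!\\w)", "abc.", "abc."));  // true
        System.out.println(find("(?!\\w)", "abc", "abcd"));   // false
    }
}
```

Whether \W or the lookahead fits better depends on whether the match should include the following character and whether the string may sit at the very end of the input.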