I have a list of keywords entered by the user, and they may contain special characters such as $, #, @, ^, &, etc.
Whenever I receive a list of text messages, I need to search every message for all of the keywords.
Each keyword must match exactly.
CASE 1: Simple Keyword - Simple Message
I used \b to match the exact keyword, and it works fine.
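import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class KeywordSearch {
    public static void main(String[] args) {
        // (?i) makes the match case-insensitive; \b is a word boundary.
        String patternStr = "(?i)\\bHello\\b";
        Pattern pattern = Pattern.compile(patternStr);
        List<String> strList = new ArrayList<String>();
        strList.add("HHello Message");
        strList.add("This is Hello Message ");
        strList.add("Now Hellos again.");
        for (String str : strList) {
            Matcher matcher = pattern.matcher(str);
            System.out.println(">> " + matcher.find());
        }
    }
}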
OUTPUT as Expected
>> false
>> true
>> false
CASE 2 : Simple Keyword - Message with Special Character
Now, if I run the same code against the following messages, it does not work as expected.
List<String> strList = new ArrayList<String>();
strList.add("#Hello Message");
strList.add("This is Hello Message ");
strList.add("Now Hellos again.");
OUTPUT:
>> true
>> true
>> false
Expected OUTPUT:
>> false
>> true
>> false
CASE 3 : Keyword & Message with Special Character
Now the keyword itself is #Hello, and I receive the following messages.
I wrote the following code, but it didn't work.
public static void main(String[] args) {
    String patternStr = "(?i)\\b#Hello\\b";
    Pattern pattern = Pattern.compile(patternStr);
    List<String> strList = new ArrayList<String>();
    strList.add("HHello Message");
    strList.add("This is #Hello Message ");
    strList.add("Now Hellos again.");
    for (String str : strList) {
        Matcher matcher = pattern.matcher(str);
        System.out.println(">> " + matcher.find());
    }
}
OUTPUT:
>> false
>> false
>> false
Expected OUTPUT:
>> false
>> true
>> false
How can I escape the special characters and resolve CASE 2 and CASE 3?
Please help.
To match a character that has a special meaning in regex, prefix it with a backslash ( \ ). E.g., \. matches "." ; \+ matches "+" ; and \( matches "(" . In a Java string literal the backslash itself must be doubled, so \. is written as "\\." .
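As a minimal sketch of the escaping rule above: instead of escaping each character by hand, Java's Pattern.quote turns a whole string into a literal pattern. The class name EscapeDemo, the helper containsLiteral, and the sample keyword are illustrative assumptions, not part of the question.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // Hypothetical helper: true if the literal keyword occurs anywhere in the text.
    static boolean containsLiteral(String keyword, String text) {
        // Pattern.quote makes regex metacharacters such as $ ^ . + ( match literally.
        Pattern p = Pattern.compile(Pattern.quote(keyword), Pattern.CASE_INSENSITIVE);
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        // "$5.00" is matched character by character, not as "end of input, then 5...".
        System.out.println(containsLiteral("$5.00", "Price is $5.00 today")); // true
        System.out.println(containsLiteral("$5.00", "Price is $5000 today")); // false
    }
}
```

This only handles escaping; it does not yet enforce that the keyword stands alone as a whole token.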
\\s - matches a single whitespace character. \\s+ - matches a sequence of one or more whitespace characters.
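You can use this regex: /^[ A-Za-z0-9_@./#&+-]*$/. It matches a whole string made up only of spaces, letters, digits, and the characters _ @ . / # & + - .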
Case 2 seems to be the opposite of case 3, so I don't think you can combine the Patterns.
For case 2, your Pattern could look like:
Pattern pattern = Pattern.compile("(\\s|^)Hello(\\s|$)", Pattern.CASE_INSENSITIVE);
In this case we surround the keyword by whitespace or beginning/end of input.
For case 3, your Pattern could look like:
Pattern pattern = Pattern.compile("[\\$#@\\^&]Hello(\\s|$)", Pattern.CASE_INSENSITIVE);
In this case, we precede the keyword with any of the special characters of your choice (note the escaped reserved characters $ and ^), then we accept whitespace or the end of input as the character following the keyword.
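Use (?:^|\s) ("start of text or whitespace") instead of the first \b, and (?:$|\s) ("end of text or whitespace") instead of the second \b in your regex. The reason \b fails is that it only marks a position between a word character and a non-word character: in case 2 it matches between # and H, and in case 3 it can never match between a space and #, because both are non-word characters.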
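The whitespace-delimited approach can be sketched as below, combined with Pattern.quote so that keywords containing $, ^, and similar characters are treated literally. This is one possible arrangement, not a definitive implementation; the class name KeywordMatcher and the helper containsKeyword are hypothetical names introduced for illustration.

```java
import java.util.regex.Pattern;

public class KeywordMatcher {
    // Hypothetical helper: true if the keyword appears as a whole
    // whitespace-delimited token in the message (case-insensitive).
    static boolean containsKeyword(String keyword, String message) {
        String regex = "(?:^|\\s)"              // start of text or whitespace
                     + Pattern.quote(keyword)   // the keyword, taken literally
                     + "(?:$|\\s)";             // end of text or whitespace
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE)
                      .matcher(message)
                      .find();
    }

    public static void main(String[] args) {
        // CASE 2: plain keyword, message with a special character.
        System.out.println(">> " + containsKeyword("Hello", "#Hello Message"));          // false
        System.out.println(">> " + containsKeyword("Hello", "This is Hello Message "));  // true
        System.out.println(">> " + containsKeyword("Hello", "Now Hellos again."));       // false
        // CASE 3: keyword with a special character.
        System.out.println(">> " + containsKeyword("#Hello", "HHello Message"));           // false
        System.out.println(">> " + containsKeyword("#Hello", "This is #Hello Message "));  // true
    }
}
```

Note that with this definition a keyword followed directly by punctuation (for example "Hello," or "Hello.") does not count as a match, since only whitespace and the text boundaries delimit a keyword.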