
Java Regular Expression to Match Exact Word with Special Characters

Tags: java, string, regex

I have a list of keywords entered by the user, and they may contain special characters such as $, #, @, ^, &, etc.

My requirement is that whenever I receive a list of text messages, I need to search every message for all of the keywords.

Each keyword must be matched exactly, as a whole word.

CASE 1: Simple Keyword - Simple Message

I used \b to match the exact keyword, and it works fine.

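import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExactWordMatch {
    public static void main(String[] args) {
        // (?i) = case-insensitive; \b = word boundary on each side of the keyword
        String patternStr = "(?i)\\bHello\\b";

        Pattern pattern = Pattern.compile(patternStr);

        List<String> strList = new ArrayList<String>();
        strList.add("HHello Message");
        strList.add("This is Hello Message ");
        strList.add("Now Hellos again.");

        for (String str : strList) {
            Matcher matcher = pattern.matcher(str);
            System.out.println(">> " + matcher.find());
        }
    }
}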

OUTPUT as Expected

>> false
>> true
>> false

CASE 2 : Simple Keyword - Message with Special Character

Now, if I run the same code against the following messages, it does not work as expected.

List<String> strList = new ArrayList<String>();
strList.add("#Hello Message");
strList.add("This is Hello Message ");
strList.add("Now Hellos again.");

OUTPUT:

>> true
>> true
>> false

Expected OUTPUT:

>> false
>> true
>> false

CASE 3 : Keyword & Message with Special Character

Suppose I receive the following messages and the keyword is #Hello. I wrote the following code, but it didn't work.

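import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExactWordMatchCase3 {
    public static void main(String[] args) {
        // \b before # needs a word character on its left, so this never matches " #Hello"
        String patternStr = "(?i)\\b#Hello\\b";

        Pattern pattern = Pattern.compile(patternStr);

        List<String> strList = new ArrayList<String>();
        strList.add("HHello Message");
        strList.add("This is #Hello Message ");
        strList.add("Now Hellos again.");

        for (String str : strList) {
            Matcher matcher = pattern.matcher(str);
            System.out.println(">> " + matcher.find());
        }
    }
}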

OUTPUT:

>> false
>> false
>> false

Expected OUTPUT:

>> false
>> true
>> false

How can I escape the special characters and resolve CASE 2 and CASE 3?

Please help.

Ankur Raiyani asked Aug 04 '13 17:08


2 Answers

Case 2 seems to be the opposite of case 3, so I don't think you can combine the patterns.

For case 2, your Pattern could look like:

Pattern pattern = Pattern.compile("(\\s|^)Hello(\\s|$)", Pattern.CASE_INSENSITIVE);

In this case we surround the keyword by whitespace or beginning/end of input.
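As a quick check, the pattern above can be run against the Case 2 messages from the question. This is only a minimal sketch: the class name Case2Check and the helper matches are made up for illustration and are not part of the original code.

```java
import java.util.regex.Pattern;

public class Case2Check {
    // Keyword must be surrounded by whitespace or by the start/end of the input.
    static final Pattern PATTERN =
            Pattern.compile("(\\s|^)Hello(\\s|$)", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: true if the message contains the stand-alone keyword.
    static boolean matches(String message) {
        return PATTERN.matcher(message).find();
    }

    public static void main(String[] args) {
        System.out.println(">> " + matches("#Hello Message"));         // >> false
        System.out.println(">> " + matches("This is Hello Message ")); // >> true
        System.out.println(">> " + matches("Now Hellos again."));      // >> false
    }
}
```

Note that with this approach a keyword followed directly by punctuation (for example "Hello,") would not count as a match, since only whitespace and the input boundaries are accepted as delimiters.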

For case 3, your Pattern could look like:

Pattern pattern = Pattern.compile("[\\$#@\\^&]Hello(\\s|$)", Pattern.CASE_INSENSITIVE);

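In this case, we precede the keyword with any one of the special characters of your choice, then we accept whitespace or the end of input as the character following the keyword. Note the escaped characters $ and ^: inside a character class the backslash is not strictly required for $ (or for ^ when it is not the first character), but it is harmless.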
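The Case 3 pattern can be checked the same way against the messages from the question. Again, this is just a sketch; the class name Case3Check and the helper matches are hypothetical names chosen for the example.

```java
import java.util.regex.Pattern;

public class Case3Check {
    // One special character from the class, then the keyword,
    // then whitespace or the end of the input.
    static final Pattern PATTERN =
            Pattern.compile("[\\$#@\\^&]Hello(\\s|$)", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: true if the message contains the prefixed keyword.
    static boolean matches(String message) {
        return PATTERN.matcher(message).find();
    }

    public static void main(String[] args) {
        System.out.println(">> " + matches("HHello Message"));          // >> false
        System.out.println(">> " + matches("This is #Hello Message ")); // >> true
        System.out.println(">> " + matches("Now Hellos again."));       // >> false
    }
}
```

This pattern does not constrain what comes before the special character, so a message such as "x#Hello " would also match. Whether that is acceptable depends on the requirement.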

Mena answered Oct 11 '22 06:10


Use (?:^|\s) ("start of text or whitespace") instead of the first \b, and (?:$|\s) ("end of text or whitespace") instead of the second \b in your regex.
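A minimal sketch of that substitution, applied to arbitrary user-entered keywords, might look like the following. The class KeywordMatcher and its methods are hypothetical names for this example. The use of Pattern.quote is an addition not mentioned in the answer: it is assumed here as a way to make characters such as # or $ in the keyword literal.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class KeywordMatcher {
    // Builds a pattern that matches the keyword only when it is delimited by
    // whitespace or by the start/end of the text. Pattern.quote turns any
    // regex metacharacters inside the keyword into literals.
    static Pattern exactKeyword(String keyword) {
        return Pattern.compile(
                "(?:^|\\s)" + Pattern.quote(keyword) + "(?:$|\\s)",
                Pattern.CASE_INSENSITIVE);
    }

    // True if the message contains the keyword as a whitespace-delimited token.
    static boolean contains(String message, String keyword) {
        return exactKeyword(keyword).matcher(message).find();
    }

    public static void main(String[] args) {
        List<String> messages = Arrays.asList(
                "#Hello Message", "This is Hello Message ", "This is #Hello Message ");
        for (String message : messages) {
            System.out.println(message
                    + " | Hello: " + contains(message, "Hello")
                    + " | #Hello: " + contains(message, "#Hello"));
        }
    }
}
```

With this construction the same code path handles both Case 2 (keyword Hello must not match inside #Hello) and Case 3 (keyword #Hello matches when surrounded by whitespace). If many messages are scanned, compiling each keyword's pattern once and reusing it would likely be preferable to recompiling on every call.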

Alex Shesterov answered Oct 11 '22 05:10