
How to find the exact word using a regex in Java?

Consider the following code snippet:

String input = "Print this";
System.out.println(input.matches("\\bthis\\b"));

Output

false

What could possibly be wrong with this approach? If it is wrong, what is the right way to find an exact word match?

PS: I have found a variety of similar questions here but none of them provide the solution I am looking for. Thanks in advance.

A Null Pointer asked Feb 27 '12 11:02


2 Answers

When you use the matches() method, it tries to match the entire input against the pattern. In your example, the input "Print this" doesn't match because the pattern only covers "this" and leaves "Print " unmatched.

So you need to add something to the regex to match the initial part of the string, e.g.

.*\\bthis\\b 

And if you want to allow extra text at the end of the line too:

.*\\bthis\\b.* 
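
To make the difference between these patterns concrete, here is a minimal sketch that runs each of them through String.matches(). The class name MatchesDemo and the helper wholeMatch are made up for illustration; only String.matches() itself comes from the JDK.

```java
class MatchesDemo {
    // Hypothetical helper: true only if the regex matches the WHOLE input,
    // which is exactly what String.matches() checks.
    static boolean wholeMatch(String input, String regex) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        String input = "Print this";
        System.out.println(wholeMatch(input, "\\bthis\\b"));      // false: "Print " is left over
        System.out.println(wholeMatch(input, ".*\\bthis\\b"));    // true: .* consumes "Print "
        System.out.println(wholeMatch(input, ".*\\bthis\\b.*"));  // true

        String longer = "Print this now";
        System.out.println(wholeMatch(longer, ".*\\bthis\\b"));   // false: " now" is left over
        System.out.println(wholeMatch(longer, ".*\\bthis\\b.*")); // true
    }
}
```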

Alternatively, create a Matcher object and call Matcher.find(), which searches for the pattern anywhere within the input string:

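    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    Pattern p = Pattern.compile("\\bthis\\b");
    Matcher m = p.matcher("Print this");
    // group() throws IllegalStateException if no match was found, so check find() first
    if (m.find()) {
        System.out.println(m.group());
    }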

Output:

this 

If you want to find multiple matches in a line, you can call find() and group() repeatedly to extract them all.
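
As a rough sketch of that repeated find() loop, the following collects the start offset of every whole-word occurrence. The class name FindAllDemo, the method findOffsets and the sample sentence are invented for illustration; Pattern, Matcher.find() and Matcher.start() are standard java.util.regex calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindAllDemo {
    // Hypothetical helper: start offsets of every whole-word "this" in the input.
    static List<Integer> findOffsets(String input) {
        Matcher m = Pattern.compile("\\bthis\\b").matcher(input);
        List<Integer> offsets = new ArrayList<>();
        // Each call to find() resumes searching after the previous match.
        while (m.find()) {
            offsets.add(m.start());
        }
        return offsets;
    }

    public static void main(String[] args) {
        // "thistle" is skipped because no word boundary follows its "this".
        System.out.println(findOffsets("this and this, but not thistle")); // [0, 9]
    }
}
```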

DNA answered Oct 14 '22 22:10


Full example method using String.matches():

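// requires: import java.util.regex.Pattern;
public static final String REGEX_FIND_WORD = "(?i).*?\\b%s\\b.*?";

public static boolean containsWord(String text, String word) {
    // Pattern.quote makes regex metacharacters in the word match literally
    String regex = String.format(REGEX_FIND_WORD, Pattern.quote(word));
    return text.matches(regex);
}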

Explanation:

  1. (?i) - case-insensitive matching
  2. .*? - allows any characters (or none) before the word
  3. \b - word boundary
  4. %s - placeholder that String.format replaces with the word (quoted with Pattern.quote so regex metacharacters in it are treated literally)
  5. \b - word boundary
  6. .*? - allows any characters (or none) after the word
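
One caveat with the matches()-based version above: "." does not match line terminators by default, so text that spans several lines will not match. As a possible alternative (a sketch, not the answerer's code), the same check can be written with find(), which needs no surrounding .*? at all. The class name WordSearch is hypothetical.

```java
import java.util.regex.Pattern;

class WordSearch {
    // Hypothetical find()-based variant of containsWord.
    static boolean containsWord(String text, String word) {
        Pattern p = Pattern.compile(
                "\\b" + Pattern.quote(word) + "\\b",
                Pattern.CASE_INSENSITIVE);
        // find() looks for the pattern anywhere in the text, across lines too.
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsWord("Print THIS", "this"));            // true
        System.out.println(containsWord("Print thistle", "this"));         // false
        System.out.println(containsWord("line one\nprint this", "this"));  // true
    }
}
```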

surfealokesea answered Oct 14 '22 21:10