Method matches does not work as expected [duplicate]

Tags: java, regex

I don't understand why the method returns false with this regex:

Pattern.matches("\\bi", "an is");

The character i is at a word boundary!

xdevel2000 asked Jul 08 '10 09:07


2 Answers

In Java, matches attempts to match a pattern against the entire string.

This is true for String.matches, Pattern.matches and Matcher.matches.

If you want to check if there's a match somewhere in a string, you can use .*\bi.*. In this case, as a Java string literal, it's ".*\\bi.*".

java.util.regex.Matcher API links

  • boolean matches(): Attempts to match the entire region against the pattern.
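
To make the whole-input behaviour concrete, here is a minimal sketch using the pattern and input from the question. The class name MatchesWholeInput and the helper wholeMatch are hypothetical names chosen for illustration; only Pattern.matches and String.matches come from the JDK.

```java
import java.util.regex.Pattern;

final class MatchesWholeInput {
    // Returns true only when the regex accounts for the entire input.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // "\\bi" can only consume a single 'i', so it cannot cover "an is".
        System.out.println(wholeMatch("\\bi", "an is"));      // false
        // Padding with .* lets the pattern cover the whole input.
        System.out.println(wholeMatch(".*\\bi.*", "an is"));  // true
        // String.matches behaves the same way.
        System.out.println("an is".matches(".*\\bi.*"));      // true
        // The bare pattern does match when the input is exactly "i".
        System.out.println(wholeMatch("\\bi", "i"));          // true
    }
}
```

The word boundary in the question is found correctly; the match fails only because the rest of the input is left unconsumed.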

What .* means

As used here, the dot . is a regex metacharacter that matches (almost) any character; by default in Java it does not match line terminators. The * is a regex metacharacter that means "zero-or-more repetitions of" the preceding element. So, for example, A.*B matches A, followed by zero or more of "any" character, followed by B (see on rubular.com).
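
As a small illustration of A.*B, the sketch below checks a few inputs. The class and method names are hypothetical. The "(almost)" shows up with the newline case: without the DOTALL flag, the dot does not match a line terminator.

```java
import java.util.regex.Pattern;

final class DotStarDemo {
    private static final Pattern A_ANY_B = Pattern.compile("A.*B");

    // Whole-input match, as with String.matches.
    static boolean matchesWhole(String s) {
        return A_ANY_B.matcher(s).matches();
    }

    // Match anywhere in the input.
    static boolean foundIn(String s) {
        return A_ANY_B.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("AB"));      // true: .* may match zero characters
        System.out.println(matchesWhole("AxyzB"));   // true
        System.out.println(matchesWhole("A\nB"));    // false: dot skips line terminators by default
        System.out.println(matchesWhole("AxyzBc"));  // false: trailing 'c' is not covered
        System.out.println(foundIn("AxyzBc"));       // true: "AxyzB" is found inside the input
    }
}
```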

References

  • regular-expressions.info/Repetition with Star and Plus and The Dot Matches (Almost) Any Character

Related questions

  • Difference between .*? and .* for regex

Note that both the . and * (as well as other metacharacters) may lose their special meaning depending on where they appear. [.*] is a character class that matches either a literal period . or a literal asterisk *. Preceding a metacharacter with a backslash also escapes it, so a\.b matches "a.b".

  • regular-expressions.info/Character Class and Literal Characters and Metacharacters
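
The two cases above can be checked with a short sketch. The helper names are hypothetical; note that the backslash itself has to be doubled inside a Java string literal.

```java
final class LiteralMetacharDemo {
    // Inside a character class, '.' and '*' are ordinary characters.
    static boolean isDotOrStar(String s) {
        return s.matches("[.*]");
    }

    // The regex a\.b, written as a Java string literal.
    static boolean isLiteralADotB(String s) {
        return s.matches("a\\.b");
    }

    public static void main(String[] args) {
        System.out.println(isDotOrStar("."));        // true
        System.out.println(isDotOrStar("*"));        // true
        System.out.println(isDotOrStar("x"));        // false
        System.out.println(isLiteralADotB("a.b"));   // true
        System.out.println(isLiteralADotB("axb"));   // false: the escaped dot is literal
        System.out.println("axb".matches("a.b"));    // true: the unescaped dot matches 'x'
    }
}
```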

Related problems

Java does not have regex-based endsWith, startsWith, and contains. You can still use matches to accomplish the same things as follows:

  • matches(".*pattern.*") - does it contain a match of the pattern anywhere?
  • matches("pattern.*") - does it start with a match of the pattern?
  • matches(".*pattern") - does it end with a match of the pattern?
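
The three idioms can be wrapped in small helpers, sketched below. The class and method names are hypothetical. Wrapping the caller's pattern in a non-capturing group (?:...) is an addition of this sketch, not part of the idioms as listed: it keeps a pattern with a top-level alternation such as a|b from absorbing the .* padding. The sketch also assumes single-line input, since . does not match line terminators by default.

```java
final class RegexStringChecks {
    // Does the input contain a match of the pattern anywhere?
    static boolean containsMatch(String input, String regex) {
        return input.matches(".*(?:" + regex + ").*");
    }

    // Does the input start with a match of the pattern?
    static boolean startsWithMatch(String input, String regex) {
        return input.matches("(?:" + regex + ").*");
    }

    // Does the input end with a match of the pattern?
    static boolean endsWithMatch(String input, String regex) {
        return input.matches(".*(?:" + regex + ")");
    }

    public static void main(String[] args) {
        System.out.println(containsMatch("an is", "\\bi"));     // true
        System.out.println(startsWithMatch("an is", "\\bi"));   // false
        System.out.println(startsWithMatch("is an", "\\bi"));   // true
        System.out.println(endsWithMatch("foo123", "\\d+"));    // true
        System.out.println(endsWithMatch("foo123x", "\\d+"));   // false
    }
}
```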

String API quick cheat sheet

Here's a quick cheat sheet that lists which methods are regex-based and which aren't:

  • Non-regex methods:
    • String replace(char oldChar, char newChar)
    • String replace(CharSequence target, CharSequence replacement)
    • boolean startsWith(String prefix)
    • boolean endsWith(String suffix)
    • boolean contains(CharSequence s)
  • Regex methods:
    • String replaceAll(String regex, String replacement)
    • String replaceFirst(String regex, String replacement)
    • String[] split(String regex)
    • boolean matches(String regex)
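
The distinction matters most when the argument contains a metacharacter. The sketch below (hypothetical class and helper names) runs a literal method and its regex counterpart on the same input containing dots.

```java
import java.util.Arrays;

final class RegexVsLiteralDemo {
    // replace treats "." as a literal period.
    static String replaceLiteralDot(String s) {
        return s.replace(".", "-");
    }

    // replaceAll treats "." as a regex that matches every character.
    static String replaceRegexDot(String s) {
        return s.replaceAll(".", "-");
    }

    // split is regex-based, so a literal period must be escaped.
    static String[] splitOnLiteralDot(String s) {
        return s.split("\\.");
    }

    public static void main(String[] args) {
        String s = "a.b.c";
        System.out.println(replaceLiteralDot(s));                     // a-b-c
        System.out.println(replaceRegexDot(s));                       // -----
        System.out.println(Arrays.toString(splitOnLiteralDot(s)));    // [a, b, c]
        // Unescaped, every character is a delimiter and only empty strings remain,
        // which split discards as trailing empties.
        System.out.println(s.split(".").length);                      // 0
        System.out.println("axb".contains("a.b"));                    // false: literal comparison
        System.out.println("axb".matches("a.b"));                     // true: regex comparison
    }
}
```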

polygenelubricants answered Sep 22 '22 08:09


The whole string has to match if you use matches:

Pattern.matches(".*\\bi.*", "an is")

This allows 0 or more characters before and after. Or:

boolean anywhere = Pattern.compile("\\bi").matcher("an is").find();

will tell you if any substring matches (true in this case). As a note, compiling a regex once and reusing the resulting Pattern can improve performance.
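
A minimal sketch of that reuse, with hypothetical class, constant, and method names: the Pattern is compiled once into a static final field and each call only creates a new Matcher. Pattern instances are immutable and safe to share between threads; Matcher instances are not.

```java
import java.util.regex.Pattern;

final class WordInitialI {
    // Compiled once, reused for every call.
    private static final Pattern WORD_INITIAL_I = Pattern.compile("\\bi");

    // True if an 'i' occurs at the start of some word in the input.
    static boolean containsWordStartingWithI(String input) {
        return WORD_INITIAL_I.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsWordStartingWithI("an is"));  // true
        System.out.println(containsWordStartingWithI("an as"));  // false
        System.out.println(containsWordStartingWithI("bit"));    // false: 'i' follows a word character
    }
}
```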

Matthew Flaschen answered Sep 24 '22 08:09