
Match word in String in Java

Tags: java, regex

I'm trying to match Strings that contain the word "#SP" (sans quotes, case insensitive) in Java. However, I'm finding using Regexes very difficult!

Strings I need to match: "This is a sample #sp string", "#SP string text...", "String text #Sp"

Strings I do not want to match: "Anything with #Spider", "#Spin #Spoon #SPORK"

Here's what I have so far: http://ideone.com/B7hHkR. Could someone guide me through building my regexp?

I've also tried: "\\w*\\s*#sp\\w*\\s*" to no avail.

Edit: Here's the code from IDEone:

java.util.regex.Pattern p = 
    java.util.regex.Pattern.compile("\\b#SP\\b", 
        java.util.regex.Pattern.CASE_INSENSITIVE);

java.util.regex.Matcher m = p.matcher("s #SP s");

if (m.find()) {
    System.out.println("Match!");
}

Marco Pietro Cirillo asked Dec 26 '22 12:12



1 Answer

(edit: positive lookbehind not needed, only matching is done, not replacement)

You are yet another victim of Java's misnamed regex matching methods.

.matches(), quite unfortunately, tries to match the whole input, which goes against the usual meaning of "regex matching" (a regex can match anywhere in the input). The method you need to use is .find().

This is a braindead API, and unfortunately Java is not the only language with such misleadingly named methods. Python is guilty too: its re.match() only matches at the start of the input.
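
To make the difference concrete, here is a minimal sketch comparing the two methods on the input from the question. The class name MatchesVsFind and both helper methods are made up for this illustration; only Pattern, Matcher, matches() and find() come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchesVsFind {
    // Hypothetical helper: true only if the regex matches the ENTIRE input.
    static boolean wholeInputMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(input);
        return m.matches();
    }

    // Hypothetical helper: true if the regex matches ANYWHERE in the input.
    static boolean foundAnywhere(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(input);
        return m.find();
    }

    public static void main(String[] args) {
        String input = "s #SP s";
        // The plain literal "#SP" is used here so that only the method differs.
        System.out.println("matches(): " + wholeInputMatches("#SP", input)); // false
        System.out.println("find():    " + foundAnywhere("#SP", input));     // true
    }
}
```

With matches(), the pattern would have to be padded (for instance with .* on both sides) to cover the whole input; find() avoids that.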

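Also, \\b only matches at a word boundary, that is, between a word character and a non-word character, and # is not a word character. In "s #SP s" the # is preceded by a space, so there is no boundary in front of it and "\\b#SP\\b" cannot match. Instead of the leading \\b, you need an alternation that matches either the beginning of the input or a whitespace character.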
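
The boundary behaviour can be checked with a small sketch like the one below. BoundaryDemo and its found() helper are hypothetical names used only for this example; the inputs other than "s #SP s" are also made up to show where \\b does and does not apply.

```java
import java.util.regex.Pattern;

public class BoundaryDemo {
    // Hypothetical helper: case-insensitive find() for the given regex.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(input).find();
    }

    public static void main(String[] args) {
        // Space followed by '#': both are non-word characters, so no boundary.
        System.out.println(found("\\b#SP\\b", "s #SP s"));      // false
        // Letter followed by '#': word/non-word transition, so \b matches here.
        System.out.println(found("\\b#SP\\b", "s#SP s"));       // true
        // Start of input or whitespace before '#', word boundary after "SP".
        System.out.println(found("(^|\\s)#SP\\b", "s #SP s"));  // true
        // The trailing \b still rejects longer words such as "#Spider".
        System.out.println(found("(^|\\s)#SP\\b", "#Spider"));  // false
    }
}
```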

Your code would need to look like this (with the classes imported rather than fully qualified):

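import java.util.regex.Matcher;
import java.util.regex.Pattern;

// (^|\\s): start of input or whitespace before '#'; \\b: word boundary after "SP"
Pattern p = Pattern.compile("(^|\\s)#SP\\b", Pattern.CASE_INSENSITIVE);

Matcher m = p.matcher("s #SP s");

// find() looks for a match anywhere in the input
if (m.find()) {
    System.out.println("Match!");
}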
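
As a quick sanity check, the same pattern can be run against the sample strings from the question. This is only a sketch: the class SpTagMatcher and the method containsSpTag are names invented for the example.

```java
import java.util.regex.Pattern;

public class SpTagMatcher {
    // Same regex as above, compiled once and reused.
    private static final Pattern SP_TAG =
            Pattern.compile("(^|\\s)#SP\\b", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: true if the input contains "#SP" as a standalone word.
    static boolean containsSpTag(String input) {
        return SP_TAG.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "This is a sample #sp string", // should match
            "#SP string text...",          // should match
            "String text #Sp",             // should match
            "Anything with #Spider",       // should not match
            "#Spin #Spoon #SPORK"          // should not match
        };
        for (String s : samples) {
            System.out.println(containsSpTag(s) + "  <-  " + s);
        }
    }
}
```

Note that the leading alternative only accepts whitespace or the start of the input, so something like "(#SP)" would not match; whether that matters depends on your data.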

fge answered Jan 04 '23 00:01
