
RegExp confusion

Tags:

java

regex

I am new to regular expressions in Java.

Could anyone please tell me the correct regular expression to use for the string below?

String exp = "ABCD_123_abc";

The regular expression I am using for this string is:

regExp = "([a-zA-Z]+)_([0-9]+)_([a-z]+)"

But the code below prints "NO Match found":

public static void main(String[] args)
{
   String exp = "ABCD_123_abc";
   String regExp = "([a-zA-Z]+)_([0-9]+)_([a-z]+)";
   Pattern pattern = Pattern.compile(exp);
   Matcher matcher = pattern.matcher(regExp);
   if(matcher.matches())
   {
     System.out.println("Match found");
   }
   else
   {
     System.out.println(" NO Match found");
   }


}
Sonia asked Mar 19 '13 09:03


7 Answers

The problem is that you accidentally swapped the regexp pattern and the string to check. Given

String exp = "ABCD_123_abc";
String regExp = "([a-zA-Z]+)_([0-9]+)_([a-z]+)";

they should be used like this:

Pattern pattern = Pattern.compile(regExp);
Matcher matcher = pattern.matcher(exp);

Pattern.compile(String regex) takes the regular expression, and Pattern.matcher(CharSequence input) takes the string to test against it.
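Putting the corrected order into a complete program, a minimal sketch might look like the following. The class name RegexOrderDemo and the helper matchesPattern are made up for illustration; only Pattern.compile, Pattern.matcher and Matcher.matches come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name is not from the question.
class RegexOrderDemo {

    // Hypothetical helper: compiles the regex and tests the whole input against it.
    static boolean matchesPattern(String regExp, String exp) {
        Pattern pattern = Pattern.compile(regExp); // compile() takes the regular expression
        Matcher matcher = pattern.matcher(exp);    // matcher() takes the string to test
        return matcher.matches();                  // true only if the entire input matches
    }

    public static void main(String[] args) {
        String exp = "ABCD_123_abc";
        String regExp = "([a-zA-Z]+)_([0-9]+)_([a-z]+)";

        // Correct order: prints "Match found"
        System.out.println(matchesPattern(regExp, exp) ? "Match found" : "No match found");

        // Swapped order, as in the question: the sample string is compiled as the
        // pattern and the regex text becomes the input. Prints "No match found".
        System.out.println(matchesPattern(exp, regExp) ? "Match found" : "No match found");
    }
}
```

The second call shows why the original code failed without throwing: "ABCD_123_abc" happens to be a valid (purely literal) pattern, so the swap compiles fine and simply never matches.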

EDIT

I apologize, my first solution was truly something that must never ever, never ever be done: the names of the variables were contradictory to the meaning of their values... That means pain and tears, and getting hit by angry colleagues while being shouted at. And there is no valid defense to this crime...

EDIT2: You can get the individual matched groups with the Matcher.group(int) method:

String matchedStringpart = matcher.group(2);

Notice that I used 2 as the argument:

  • 0 means the entire matched input
  • 1 means the first group (ABCD in this case)
  • ... and so on

If you only need the 123 part, I'd rewrite the regex for clarity:

regExp = "[a-zA-Z]+_([0-9]+)_[a-z]+";

However, in that case group() has to be called with 1, as the digits are now the first (and only) capture group:

String matchedStringpart = matcher.group(1);
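To see the group numbering side by side, here is a small sketch that runs both regexes against the sample string. The class GroupNumberDemo and the helper groupOf are hypothetical names introduced only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing group numbers for the two regexes above.
class GroupNumberDemo {

    // Hypothetical helper: returns the requested group, or null if the input does not match.
    static String groupOf(String regExp, String input, int group) {
        Matcher matcher = Pattern.compile(regExp).matcher(input);
        return matcher.matches() ? matcher.group(group) : null;
    }

    public static void main(String[] args) {
        String exp = "ABCD_123_abc";

        // Three capture groups: the digits are group 2
        System.out.println(groupOf("([a-zA-Z]+)_([0-9]+)_([a-z]+)", exp, 2)); // 123

        // One capture group: the digits are group 1
        System.out.println(groupOf("[a-zA-Z]+_([0-9]+)_[a-z]+", exp, 1));     // 123

        // Group 0 is always the whole match
        System.out.println(groupOf("[a-zA-Z]+_([0-9]+)_[a-z]+", exp, 0));     // ABCD_123_abc
    }
}
```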
ppeterka answered Oct 12 '22 03:10


You're not compiling the regexp. You need

Pattern pattern = Pattern.compile(regExp);
Matcher matcher = pattern.matcher(exp);

i.e. your above code is confusing the regexp and the input string. Your actual regexp is correct, however.

Brian Agnew answered Oct 12 '22 04:10


Your regex is perfectly fine.

The problem comes from the fact that you swapped exp and regExp in your code. The function compile takes as argument a regular expression, whereas the function matcher takes the expression to match.

alestanis answered Oct 12 '22 04:10


Your (edited) regexp is fine.

If you want to extract 123, you can use matcher.group(2). That method can only be invoked after matches or find. matcher.group(n) returns the n-th capture group. A capture group is a part of your regexp that is enclosed in parentheses. matcher.group(0) returns the matched string.

Example

if(matcher.matches()) {
  System.out.println(matcher.group(0));
  System.out.println(matcher.group(1));
  System.out.println(matcher.group(2));
  System.out.println(matcher.group(3));
}

prints

 ABCD_123_abc
 ABCD
 123
 abc
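As for the point that group() can only be invoked after matches or find: as far as I know, calling it earlier throws an IllegalStateException. A minimal sketch of that behaviour follows; the class GroupBeforeMatchDemo and its helper are hypothetical names used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: shows what happens when group() is called too early.
class GroupBeforeMatchDemo {

    // Hypothetical helper: returns true if group(2) throws before any match attempt.
    static boolean groupThrowsBeforeMatch(String regExp, String input) {
        Matcher matcher = Pattern.compile(regExp).matcher(input);
        try {
            matcher.group(2); // no matches() or find() has been called yet
            return false;
        } catch (IllegalStateException e) {
            return true;      // the matcher holds no match result to read groups from
        }
    }

    public static void main(String[] args) {
        String regExp = "([a-zA-Z]+)_([0-9]+)_([a-z]+)";
        System.out.println(groupThrowsBeforeMatch(regExp, "ABCD_123_abc")); // true
    }
}
```

This is why the example above reads the groups inside the if(matcher.matches()) block.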
Javier answered Oct 12 '22 02:10


if(exp.matches(regExp))

This alone is enough. You don't need a Pattern/Matcher unless you need something more, such as reading the capture groups or reusing the compiled pattern.
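A quick sketch of the one-liner in use, assuming you only need a yes/no answer. The class StringMatchesDemo and the method isValid are hypothetical names; String.matches itself is standard.

```java
// Hypothetical demo class: validation with String.matches, no explicit Pattern/Matcher.
class StringMatchesDemo {

    // Hypothetical helper: true if the whole string has the letters_digits_lowercase shape.
    static boolean isValid(String exp) {
        return exp.matches("([a-zA-Z]+)_([0-9]+)_([a-z]+)");
    }

    public static void main(String[] args) {
        System.out.println(isValid("ABCD_123_abc"));  // true
        System.out.println(isValid("ABCD_123"));      // false: third part is missing
        System.out.println(isValid("ABCD_123_abc!")); // false: matches() tests the entire string
    }
}
```

One trade-off to keep in mind: String.matches compiles the regex on each call, so for a check inside a hot loop a precompiled Pattern is probably the better choice.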

Rahul answered Oct 12 '22 03:10


In this case, if you want to retrieve 123, use the following code:

 System.out.println(matcher.group(2));

This prints: 123

Your regex is perfectly fine.

Ankur Shanbhag answered Oct 12 '22 02:10


This pattern will work: it matches one or more upper- or lower-case letters, then an underscore, then one or more digits, then an underscore, then one or more upper- or lower-case letters. If you want to be more specific, you can use {n} rather than + to match an exact number of characters.

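import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        final String myString = "ABCD_123_abc";
        // ++ is a possessive quantifier: one or more, without backtracking
        final Pattern p = Pattern.compile("^[A-Za-z]++_(\\d++)_[A-Za-z]++$");
        final Matcher matcher = p.matcher(myString);
        if (matcher.matches()) {
            System.out.println(matcher.group(1)); // the digits between the underscores
        }
    }
}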
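For the {n} suggestion, a sketch under the assumption that the format is always exactly 4 letters, 3 digits and 3 letters. Those widths are guessed from the single sample string, not stated in the question, and the class FixedLengthDemo and method extractDigits are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: exact-count quantifiers instead of +.
class FixedLengthDemo {

    // Assumed widths (4 letters, 3 digits, 3 letters) come from the sample string only.
    private static final Pattern FIXED =
            Pattern.compile("[A-Za-z]{4}_(\\d{3})_[A-Za-z]{3}");

    // Hypothetical helper: returns the digits, or null if the input has the wrong shape.
    static String extractDigits(String input) {
        Matcher matcher = FIXED.matcher(input);
        return matcher.matches() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractDigits("ABCD_123_abc"));  // 123
        System.out.println(extractDigits("ABCDE_123_abc")); // null: five letters, not four
        System.out.println(extractDigits("ABCD_12_abc"));   // null: two digits, not three
    }
}
```

The ^ and $ anchors are left out here because matches() already requires the whole input to match.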
Boris the Spider answered Oct 12 '22 04:10