
regex for letters or numbers in brackets

Tags: java, regex
I am using Java to process text using regular expressions. I am using the following regular expression

^[\([0-9a-zA-Z]+\)\s]+

to match one or more letters or numbers in parentheses, one or more times. For instance, I would like to match (aaa) (bb) (11) (AA) (iv) or (111) (aaaa) (i) (V).

I tested this regular expression on http://java-regex-tester.appspot.com/ and it is working. But when I use it in my code, the code does not compile. Here is my code:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    public class Tester {

        public static void main(String[] args) {

            Pattern pattern = Pattern.compile("^[\([0-9a-zA-Z]+\)\s]+");

            String[] words = pattern.split("(a) (1) (c) (xii) (A) (12) (ii)");

            String w = pattern.

            for(String s:words){

                System.out.println(s);

            }
        }
    }

I tried to use \\ instead of \, but the regex then gave different results than I expected: it matches only one group, such as (aaa), rather than multiple groups such as (aaa) (111) (ii).

Two questions:

  1. How can I fix this regex and be able to match multiple groups?
  2. How can I get the individual matches separately (like (aaa) alone, then (111), and so on)? I tried pattern.split, but it did not work for me.
user1787222 asked Jan 12 '23 08:01


2 Answers

Firstly, escape every backslash inside the quotation marks with another backslash. The compiler turns each \\ in the string literal into a single backslash, and that single backslash is what the regex engine sees. (For example, write the word-character class \w as "\\w" in Java source.)

Secondly, you need to finish the line that reads:

    String w = pattern.

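That incomplete statement is one reason the code does not compile. The unescaped \( and \) in the string literal are another, because they are not valid Java escape sequences.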
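To make the escaping point concrete, here is a minimal sketch. The class name EscapeDemo, the helper groups(), and the simplified pattern are illustrative assumptions, not code from the question. It writes the pattern with doubled backslashes and uses Matcher.find() in a loop, which is one common way to get each parenthesized group separately:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EscapeDemo {
    // In Java source, "\\(" is the two-character regex \( : a literal '('.
    // Hypothetical simplified pattern: one parenthesized run of letters/digits.
    static final Pattern GROUP = Pattern.compile("\\([0-9a-zA-Z]+\\)");

    // Collects every parenthesized group found anywhere in the text.
    static List<String> groups(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = GROUP.matcher(text);
        while (m.find()) {          // each find() advances to the next match
            found.add(m.group());   // the full text of that match
        }
        return found;
    }

    public static void main(String[] args) {
        for (String g : groups("(a) (1) (c) (xii) (A) (12) (ii)")) {
            System.out.println(g);
        }
    }
}
```

Note that this sketch finds groups anywhere in the input, not only at the beginning of a line. If only the leading groups are wanted, the pattern needs some form of anchoring.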

La-comadreja answered Jan 24 '23 09:01


Here is my final solution. It matches the individual groups of letters/numbers in brackets that appear at the beginning of a line and ignores the rest:

    import java.util.ArrayList;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Tester {
        static ArrayList<String> listOfEnums;

        public static void main(String[] args) {
            listOfEnums = new ArrayList<String>();
            // Anchored at the start: one parenthesized run of letters/digits.
            Pattern pattern = Pattern.compile("^\\([0-9a-zA-Z]+\\)");
            String p = "(a) (1) (c) (xii) (A) (12) (ii) and the good news (1)";
            Matcher matcher = pattern.matcher(p);
            // Each time a match is found at the start, store it and strip it off.
            while (matcher.find()) {
                String s = matcher.group();
                System.out.println(s);
                // Store it in the list.
                listOfEnums.add(s);
                // Remove it from the beginning of the string, then try again.
                p = p.substring(s.length()).trim();
                matcher = pattern.matcher(p);
            }
        }
    }
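
A possible variation on the loop above, offered only as a sketch: Java's \G boundary matcher ties each match to the point where the previous one ended, so the leading groups can be collected without rebuilding the string after every match. The class name LeadingGroups and the helper leadingGroups() are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LeadingGroups {
    // \G matches at the start of the input for the first find(), and at the
    // end of the previous match afterwards. \s* skips separating whitespace.
    private static final Pattern LEADING =
            Pattern.compile("\\G\\s*(\\([0-9a-zA-Z]+\\))");

    // Returns the parenthesized groups that form an unbroken run at the
    // beginning of the line; anything after the run is ignored.
    static List<String> leadingGroups(String line) {
        List<String> result = new ArrayList<>();
        Matcher m = LEADING.matcher(line);
        while (m.find()) {
            result.add(m.group(1)); // capture group 1 excludes the whitespace
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "(a) (1) (c) (xii) (A) (12) (ii) and the good news (1)";
        for (String g : leadingGroups(line)) {
            System.out.println(g);
        }
    }
}
```

One behavioral difference to be aware of: because of the \s* prefix, this sketch also accepts whitespace before the first group, which the ^-anchored pattern does not.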
user1787222 answered Jan 24 '23 09:01