
How do I match text within parentheses using regex?

Tags: java, regex

I have the following pattern:

(COMPANY) -277.9887 (ASP,) -277.9887 (INC.) 

I want the final output to be:

COMPANY ASP, INC.

Currently I have the following code, and it keeps returning the original pattern (I assume because the group spans everything between the first '(' and the last ')'):

Pattern p = Pattern.compile("((.*))",Pattern.DOTALL);
Matcher matcher = p.matcher(eName);
while(matcher.find())
{
    System.out.println("found match:"+matcher.group(1));
}

I am struggling to get the results I need and would appreciate any help. I am not worried about concatenating the results after I get each group; I just need to get each group.

northpole asked Aug 26 '09 20:08


3 Answers

Pattern p = Pattern.compile("\\((.*?)\\)",Pattern.DOTALL);
chaos answered Sep 23 '22 00:09


Your .* quantifier is 'greedy', so yes, it's grabbing everything between the first and last available parentheses. As chaos says, tersely :), .*? is the non-greedy form of the quantifier, so it grabs as little as possible while still allowing the overall match to succeed.
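To make the greedy versus non-greedy difference concrete, here is a minimal sketch that runs both forms against the string from the question. The class name GreedyVersusLazy and the helper captures are made up for illustration; only Pattern and Matcher come from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVersusLazy {
    // Illustrative helper: collects group 1 of every match of regex in input.
    static List<String> captures(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        while (matcher.find()) {
            found.add(matcher.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String eName = "(COMPANY) -277.9887 (ASP,) -277.9887 (INC.) ";

        // Greedy: .* runs to the last ')' in the input, so there is one long match.
        System.out.println(captures("\\((.*)\\)", eName).size());  // prints 1

        // Non-greedy: .*? stops at the nearest ')', so each pair matches separately.
        System.out.println(captures("\\((.*?)\\)", eName).size()); // prints 3
    }
}
```

With the non-greedy form the three captures are COMPANY, ASP, (with its trailing comma) and INC., which is what the question asks for.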

You also need to escape the parentheses within the regex; otherwise they become another group. That's assuming there are literal parentheses in your string. I suspect that what you called your pattern in the initial question is in fact your string.
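A small sketch of the escaping point, assuming the question's code really was compiled without backslashes as shown there. The class name UnescapedParens and the helper firstGroup are hypothetical; groupCount() reports how many capturing groups the compiled pattern contains.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnescapedParens {
    // Illustrative helper: group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher matcher = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    // Illustrative helper: number of capturing groups in the compiled pattern.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        String eName = "(COMPANY) -277.9887 (ASP,) -277.9887 (INC.) ";

        // Unescaped: both pairs of parentheses are groups and match no literal '('.
        System.out.println(groupCount("((.*))"));                   // prints 2
        System.out.println(eName.equals(firstGroup("((.*))", eName))); // prints true

        // Escaped outer pair: literal '(' and ')' around a single capturing group.
        System.out.println(groupCount("\\((.*?)\\)"));              // prints 1
        System.out.println(firstGroup("\\((.*?)\\)", eName));       // prints COMPANY
    }
}
```

In the unescaped form, group 1 is simply the whole input, which matches the behaviour described in the question.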

Query: are "COMPANY", "ASP," and "INC." required?

If they must be non-empty, use + instead of *. The + quantifier matches one or more characters and * matches zero or more, so a pattern using * would also match the literal string "()".

eg: "\\((.+?)\\)"
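The difference between * and + can be checked on an empty pair of parentheses. This is only a sketch: the class name StarVersusPlus and the helper captures are invented for the example, and the last case adds a negated character class, [^)]+, which is not part of the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StarVersusPlus {
    // Illustrative helper: collects group 1 of every match of regex in input.
    static List<String> captures(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        while (matcher.find()) {
            found.add(matcher.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        // * allows zero characters, so "()" matches with an empty group 1.
        System.out.println(captures("\\((.*?)\\)", "()").size()); // prints 1

        // + needs at least one character, so "()" alone does not match.
        System.out.println(captures("\\((.+?)\\)", "()").size()); // prints 0

        // Caveat: when more text follows, .+? may consume the first ')' itself.
        System.out.println(captures("\\((.+?)\\)", "() (INC.)"));  // prints [) (INC.]

        // Alternative (an assumption, not from the answer): forbid ')' in the group.
        System.out.println(captures("\\(([^)]+)\\)", "() (INC.)")); // prints [INC.]
    }
}
```

If empty pairs can appear next to real ones in the input, the negated class may be the safer choice, because the captured text can then never contain a closing parenthesis.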

ptomli answered Sep 24 '22 00:09


Tested with Java 8. The pattern below returns the string inside the parentheses.

Breakdown of the regular expression \(+\s*([^\s)]+)\s*\)+

* \(+ : matches the literal character "(" one or more times
* \s* : matches zero or more whitespace characters
* ( : start of the capturing group
* [^\s)]+ : matches one or more characters that are neither whitespace nor ")"
* ) : end of the capturing group
* \s* : matches zero or more whitespace characters
* \)+ : matches the literal character ")" one or more times

private static Pattern REGULAR_EXPRESSION = Pattern.compile("\\(+\\s*([^\\s)]+)\\s*\\)+");
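As a rough usage sketch for this pattern, the class below collects each captured group and joins them with single spaces to get the output the question asks for. The class name NoWhitespacePattern and the method joinGroups are hypothetical, and the choice of a single space as separator is an assumption based on the desired output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NoWhitespacePattern {
    private static final Pattern REGULAR_EXPRESSION =
            Pattern.compile("\\(+\\s*([^\\s)]+)\\s*\\)+");

    // Illustrative helper: joins every captured group with a single space.
    static String joinGroups(String input) {
        List<String> parts = new ArrayList<>();
        Matcher matcher = REGULAR_EXPRESSION.matcher(input);
        while (matcher.find()) {
            parts.add(matcher.group(1));
        }
        return String.join(" ", parts);
    }

    public static void main(String[] args) {
        String eName = "(COMPANY) -277.9887 (ASP,) -277.9887 (INC.) ";
        System.out.println(joinGroups(eName));          // prints COMPANY ASP, INC.

        // Repeated parentheses and padding inside them are tolerated.
        System.out.println(joinGroups("(( padded ))")); // prints padded
    }
}
```

Because the capturing group excludes whitespace, a group with a space inside it, such as "(ACME CORP)", is not matched at all by this pattern; the non-greedy \((.*?)\) form from the earlier answers would handle that case.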
Chetan Laddha answered Sep 23 '22 00:09