I have a String that contains 2 or 3 company names, each enclosed in parentheses. Each company name can itself contain words in parentheses. I need to separate them using regular expressions but couldn't find out how.
My inputStr:
(Motor (Sport) (racing) Ltd.) (Motorsport racing (Ltd.)) (Motorsport racing Ltd.)
or
(Motor (Sport) (racing) Ltd.) (Motorsport racing (Ltd.))
The expected result is:
str1 = Motor (Sport) (racing) Ltd.
str2 = Motorsport racing (Ltd.)
str3 = Motorsport racing Ltd.
My code:
String str1, str2, str3;
Pattern p = Pattern.compile("\\((.*?)\\)");
Matcher m = p.matcher(inputStr);
int index = 0;
while (m.find()) {
    String text = m.group(1);
    text = text != null && StringUtils.countMatches(text, "(") != StringUtils.countMatches(text, ")") ? text + ")" : text;
    if (index == 0) {
        str1 = text;
    } else if (index == 1) {
        str2 = text;
    } else if (index == 2) {
        str3 = text;
    }
    index++;
}
This works great for str2 and str3 but not for str1.
Current result:
str1 = Motor (Sport)
str2 = Motorsport racing (Ltd.)
str3 = Motorsport racing Ltd.
You can solve this problem without regex; refer to this question about how to find the outermost parentheses.
Here is an example:
import java.util.Stack;

public class Main {
    public static void main(String[] args) {
        String input = "(Motor (Sport) (racing) Ltd.) (Motorsport racing (Ltd.)) (Motorsport racing Ltd.)";
        for (int index = 0; index < input.length(); ) {
            if (input.charAt(index) == '(') {
                int close = findClose(input, index); // find the matching closing parenthesis
                System.out.println(input.substring(index + 1, close));
                index = close + 1; // skip the content and any nested parentheses
            } else {
                index++;
            }
        }
    }

    // Returns the index of the ')' that closes the '(' at position start.
    private static int findClose(String input, int start) {
        Stack<Integer> stack = new Stack<>();
        for (int index = start; index < input.length(); index++) {
            if (input.charAt(index) == '(') {
                stack.push(index);
            } else if (input.charAt(index) == ')') {
                stack.pop();
                if (stack.isEmpty()) {
                    return index;
                }
            }
        }
        // unreachable if the parentheses are balanced
        throw new IllegalArgumentException("Unbalanced parentheses starting at index " + start);
    }
}
Output:
Motor (Sport) (racing) Ltd.
Motorsport racing (Ltd.)
Motorsport racing Ltd.
Assuming the parentheses nest at most two levels deep, this can be done without too much regex magic. I would go with this code:
List<String> matches = new ArrayList<>();
Pattern p = Pattern.compile("\\([^()]*(?:\\([^()]*\\)[^()]*)*\\)");
Matcher m = p.matcher(inputStr);
while (m.find()) {
    String fullMatch = m.group();
    matches.add(fullMatch.substring(1, fullMatch.length() - 1));
}
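As a variant of the loop above, the outer parentheses can also be left out of the result by wrapping the inner part of the pattern in a capturing group and reading m.group(1) instead of calling substring. This is only a minimal sketch under the same assumption of at most two nesting levels; the class name GroupExtract and the method name extract are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupExtract {
    // Same pattern as above, with one capturing group around everything
    // between the outer parentheses.
    private static final Pattern COMPANY =
            Pattern.compile("\\(([^()]*(?:\\([^()]*\\)[^()]*)*)\\)");

    // Hypothetical helper: returns each company name without its outer parentheses.
    static List<String> extract(String input) {
        List<String> names = new ArrayList<>();
        Matcher m = COMPANY.matcher(input);
        while (m.find()) {
            names.add(m.group(1)); // group 1 excludes the outer parentheses
        }
        return names;
    }

    public static void main(String[] args) {
        String inputStr = "(Motor (Sport) (racing) Ltd.) (Motorsport racing (Ltd.)) (Motorsport racing Ltd.)";
        for (String name : extract(inputStr)) {
            System.out.println(name);
        }
    }
}
```

Returning a List avoids the separate str1, str2, str3 variables, so the same code handles both the two-name and the three-name input. The pattern itself is otherwise unchanged, so the explanation below applies to it as well.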
Explanation:
\\( matches the opening parenthesis of a company name.
[^()]* matches any run of non-parenthesis characters.
(?:...)* is a non-capturing group repeated zero or more times: we will see some stuff within parentheses, and then some non-parentheses again:
\\([^()]*\\)[^()]* - it's important that we don't allow any more parentheses within the inside parentheses.
\\) matches the closing parenthesis of the company name.
m.group(); returns the actual full match.
fullMatch.substring(1, fullMatch.length() - 1) removes the parentheses from the start and the end. You could do it with another group too. I just didn't want to make the regex uglier.

Why not just solve it using a stack? It will have O(n) complexity only.
Every time you come across a '(', push it to the stack, and every time you come across a ')', pop from the stack. Otherwise, put the character in a buffer.
If the stack is not empty when you come across a '(', then that means it is in a company name, so also put that in the buffer.
If the stack is not empty after popping, put the ')' in the buffer as it is part of the company name.
If the stack is empty after popping, that means that the current company name has ended: the buffer value is the name of the company, so store it and clear the buffer.
String string = "(Motor (Sport) (racing) Ltd.) (Motorsport racing (Ltd.)) (Motorsport racing Ltd.)";
List<String> result = new ArrayList<>();
StringBuffer buffer = new StringBuffer();
Stack<Character> stack = new Stack<Character>();
for (int j = 0; j < string.length(); j++) {
    if (string.charAt(j) == '(') {
        // a nested '(' belongs to the company name
        if (!stack.empty())
            buffer.append('(');
        stack.push('(');
    } else if (string.charAt(j) == ')') {
        stack.pop();
        if (stack.empty()) {
            // outermost ')' reached: the buffer holds one complete name
            result.add(buffer.toString());
            buffer = new StringBuffer();
        } else
            buffer.append(')');
    } else if (!stack.empty()) {
        // skip characters between groups so names get no leading space
        buffer.append(string.charAt(j));
    }
}
for (int i = 0; i < result.size(); i++) {
    System.out.println(result.get(i));
}
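Since the stack in this approach only ever holds '(' characters, an integer depth counter carries the same information. The following is a minimal sketch of that idea, not the original answer's code: the names TopLevelGroups and split are hypothetical, and skipping the characters between groups and throwing on unbalanced input are design choices assumed here.

```java
import java.util.ArrayList;
import java.util.List;

class TopLevelGroups {
    // Hypothetical helper: returns the text inside each outermost pair of parentheses.
    static List<String> split(String input) {
        List<String> result = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        int depth = 0; // number of currently open parentheses
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '(') {
                if (depth > 0) {
                    buffer.append(c); // nested '(' is part of the name
                }
                depth++;
            } else if (c == ')') {
                if (depth == 0) {
                    throw new IllegalArgumentException("Unmatched ')' at index " + i);
                }
                depth--;
                if (depth == 0) {
                    result.add(buffer.toString()); // outermost group closed
                    buffer.setLength(0);
                } else {
                    buffer.append(c); // nested ')' is part of the name
                }
            } else if (depth > 0) {
                buffer.append(c); // characters between groups are skipped
            }
        }
        if (depth != 0) {
            throw new IllegalArgumentException("Unclosed '(' in input");
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "(Motor (Sport) (racing) Ltd.) (Motorsport racing (Ltd.)) (Motorsport racing Ltd.)";
        for (String name : split(input)) {
            System.out.println(name);
        }
    }
}
```

Unlike the regex approach, this works for any nesting depth, and it still makes a single pass over the input.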