
Extract overlapping matches using split

Tags: java, regex, split

How can I extract overlapping matches from an input using String.split()?

For example, if trying to find matches to "aba":

String input = "abababa";
String[] parts = input.split(???);

Expected output:

[aba, aba, aba]
Bohemian asked Feb 27 '26 02:02



2 Answers

String#split cannot give you overlapping matches. Each character of the input ends up in at most one element of the resulting array (or is consumed by a delimiter), never in two elements.
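To make that concrete, here is a minimal sketch of what happens if you split the example input on "aba" itself. The class name SplitDemo and the helper splitOn are made up for illustration and are not part of the question.

```java
import java.util.Arrays;

public class SplitDemo {
    // Hypothetical helper: plain String#split, with the delimiter treated as a regex.
    static String[] splitOn(String input, String delimiter) {
        return input.split(delimiter);
    }

    public static void main(String[] args) {
        // "aba" is consumed as a delimiter at indices 0-2 and 4-6,
        // so only the "b" at index 3 survives as content.
        String[] parts = splitOn("abababa", "aba");
        System.out.println(Arrays.toString(parts)); // prints [, b]
    }
}
```

The occurrence starting at index 2 is never seen, because its first character was already consumed by the match at index 0.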

Use the Pattern and Matcher classes instead, with this regex:

Pattern pattern = Pattern.compile("(?=(aba))");

Then call Matcher#find in a loop to visit every overlapping match, and read group(1) for each one.

The regex matches the empty string at every position that is followed by aba, and the capturing group inside the look-ahead holds the matched text. Because a look-ahead is a zero-width assertion, it does not consume the characters it matches. The next search therefore resumes just one character later and can find an occurrence that overlaps the previous one.

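import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "abababa";
String patternToFind = "aba";

// Capture inside a zero-width look-ahead so that no input is consumed.
Pattern pattern = Pattern.compile("(?=(" + patternToFind + "))");
Matcher matcher = pattern.matcher(input);

while (matcher.find()) {
    // group(1) is the text captured inside the look-ahead.
    System.out.println(matcher.group(1) + " found at index: " + matcher.start());
}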

Output:

aba found at index: 0
aba found at index: 2
aba found at index: 4
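If you want the array-like result from the question rather than printed indices, the same look-ahead can be collected into a list. This is only a sketch: the class OverlappingMatches and the method findOverlapping are hypothetical names, and the use of Pattern.quote assumes the search text should be taken literally rather than as a regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OverlappingMatches {
    // Hypothetical helper: returns every (possibly overlapping) occurrence of a literal string.
    static List<String> findOverlapping(String input, String literal) {
        // Assumption: the search text is literal, so escape it with Pattern.quote.
        // Drop the quote call if you want to pass a regex instead.
        Pattern pattern = Pattern.compile("(?=(" + Pattern.quote(literal) + "))");
        Matcher matcher = pattern.matcher(input);
        List<String> matches = new ArrayList<>();
        while (matcher.find()) {
            // The look-ahead itself matches an empty string; group(1) holds the text.
            matches.add(matcher.group(1));
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findOverlapping("abababa", "aba")); // prints [aba, aba, aba]
    }
}
```

Call toArray(new String[0]) on the result if you really need a String[].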
Rohit Jain answered Feb 28 '26 16:02



I would use indexOf.

for(int i = text.indexOf(find); i >= 0; i = text.indexOf(find, i + 1))
   System.out.println(find + " found at " + i);
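For completeness, a self-contained sketch of the same loop that collects the start indices instead of printing them. The class IndexOfMatches and the method findIndices are names invented here, and the empty-string guard is an added assumption: without it the loop would never terminate for an empty search string, since indexOf keeps returning the text length.

```java
import java.util.ArrayList;
import java.util.List;

public class IndexOfMatches {
    // Hypothetical helper: start indices of all (possibly overlapping) occurrences of find in text.
    static List<Integer> findIndices(String text, String find) {
        List<Integer> indices = new ArrayList<>();
        if (find.isEmpty()) {
            // Assumption: treat an empty search string as "no matches" to avoid an endless loop.
            return indices;
        }
        // Restart the search one character after each hit so overlapping hits are not skipped.
        for (int i = text.indexOf(find); i >= 0; i = text.indexOf(find, i + 1)) {
            indices.add(i);
        }
        return indices;
    }

    public static void main(String[] args) {
        System.out.println(findIndices("abababa", "aba")); // prints [0, 2, 4]
    }
}
```

Advancing by i + 1 rather than i + find.length() is what makes the overlapping occurrences visible.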
Peter Lawrey answered Feb 28 '26 17:02



