Parsing a random string looking for repeating sequences using Java and Regex.
Consider the string:
aaabbaaacccbb
I'd like a regular expression that finds all of the repeated runs in the above string:
aaabbaaacccbb
^^^  ^^^
aaabbaaacccbb
   ^^      ^^
What regular expression will check a string for any repeating sequences of characters and return those repeating sequences as groups, such that group 1 = aaa and group 2 = bb? Note that this is only an example string; any repeating characters are valid: RonRonJoeJoe ... ... ,, ,,...,,
The * (asterisk) quantifier matches zero or more repetitions of the element before it, so a pattern such as .* matches any run of characters (by default excluding line breaks). \s (the whitespace metacharacter) matches any whitespace character (space, tab, line break, ...), and \S (the opposite of \s) matches anything that is not a whitespace character.
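To make the difference between \s, \S and .* concrete, here is a minimal sketch. The class name WhitespaceDemo, the helper allMatches and the sample input are made up for illustration; only the java.util.regex calls are standard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WhitespaceDemo {
    // Hypothetical helper: collects every match of the regex in the input, in order.
    static List<String> allMatches(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "Ron Joe\tSam"; // one space and one tab

        // \S+ : runs of non-whitespace characters
        System.out.println(allMatches("\\S+", input));     // [Ron, Joe, Sam]

        // \s : each single whitespace character (the space and the tab)
        System.out.println(allMatches("\\s", input).size()); // 2

        // .* : matches the whole input because it contains no line break
        System.out.println(input.matches(".*"));           // true
    }
}
```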
Difference between matches() and find() in Java regex: the matches() method returns true only if the regular expression matches the whole text; otherwise it returns false. The find() method instead searches the text for the next occurrence of the regular expression that was passed to Pattern.compile().
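A small sketch of that difference, using the repeated-sequence pattern from this question. The class and method names (MatchesVsFind, wholeInputMatches, containsMatch) are placeholders chosen for the example.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    // A sequence immediately followed by one or more copies of itself.
    static final Pattern REPEAT = Pattern.compile("(.+)\\1+");

    // true only if the entire string is one repeated run
    static boolean wholeInputMatches(String s) {
        return REPEAT.matcher(s).matches();
    }

    // true if a repeated run occurs anywhere in the string
    static boolean containsMatch(String s) {
        return REPEAT.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeInputMatches("RonRon")); // true: the whole text is "Ron" twice
        System.out.println(wholeInputMatches("aaabb"));  // false: "aaa" and "bb" are separate runs
        System.out.println(containsMatch("aaabb"));      // true: find() locates "aaa"
        System.out.println(containsMatch("RonBobJoe"));  // false: nothing repeats back to back
    }
}
```

This is why the answers here loop on find(): the input as a whole is rarely a single repeated run.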
This does it:
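import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        String s = "aaabbaaacccbb";
        find(s);
        String s1 = "RonRonRonJoeJoe .... ,,,,";
        find(s1);
        System.err.println("---");
        // No sequence is immediately repeated here, so nothing is printed.
        String s2 = "RonBobRonJoe";
        find(s2);
    }

    // Prints every run in which a sequence is immediately followed by
    // one or more copies of itself.
    private static void find(String s) {
        // (.+) captures a candidate sequence; \\1+ requires it to repeat.
        Matcher m = Pattern.compile("(.+)\\1+").matcher(s);
        while (m.find()) {
            // group() is the whole run; group(1) is the repeating unit.
            System.err.println(m.group());
        }
    }
}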
OUTPUT:
aaa
bb
aaa
ccc
bb
RonRonRon
JoeJoe
....
,,,,
---
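If the repeating unit itself is wanted rather than the whole run, group(1) of the same pattern holds it. A rough sketch follows; the class RepeatUnits and the method describe are hypothetical names, and the "unit xN" output format is just one way to show the result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatUnits {
    // For each adjacent repeated run, reports the captured unit and how many
    // times it occurs in the run, e.g. "Ron x3".
    static List<String> describe(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile("(.+)\\1+").matcher(s);
        while (m.find()) {
            String unit = m.group(1);                         // the captured sequence
            int times = m.group().length() / unit.length();   // whole run / unit length
            out.add(unit + " x" + times);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(describe("aaabb"));           // [a x3, b x2]
        System.out.println(describe("RonRonRonJoeJoe")); // [Ron x3, Joe x2]
    }
}
```

Because (.+) is greedy, the captured unit is the longest one that still repeats: "aaaa" is reported as "aa x2". Changing the group to (.+?) makes the regex prefer the shortest unit instead.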
The code below should meet all of the requirements. It combines a couple of the other answers here and prints every substring that is repeated later in the string, whether or not the repeat is adjacent.
I set it to return only substrings of at least 2 characters, but this is easily changed to allow single characters by replacing "{2,}" in the regex with "+".
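import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedSubstrings
{
    public static void main(String[] args)
    {
        String s = "RonSamJoeJoeSamRon";
        // (\\S{2,}) captures at least two non-whitespace characters;
        // the lookahead (?=.*?\\1) requires the same text to occur again later.
        Matcher m = Pattern.compile("(\\S{2,})(?=.*?\\1)").matcher(s);
        while (m.find())
        {
            // The pattern has one capturing group, so this prints group 1.
            for (int i = 1; i <= m.groupCount(); i++)
            {
                System.out.println(m.group(i));
            }
        }
    }
}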
Output:
Ron
Sam
Joe
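For comparison, here is a sketch of the single-character variant mentioned above, with the minimum length as a parameter. NonAdjacentRepeats and repeatedSubstrings are names invented for this example; \S{1,} is equivalent to \S+.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonAdjacentRepeats {
    // Returns each substring of at least minLength non-whitespace characters
    // that occurs again somewhere later in the string.
    static List<String> repeatedSubstrings(String s, int minLength) {
        String regex = "(\\S{" + minLength + ",})(?=.*?\\1)";
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(s);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "RonSamJoeJoeSamRon";
        System.out.println(repeatedSubstrings(s, 2)); // [Ron, Sam, Joe]
        System.out.println(repeatedSubstrings(s, 1)); // [Ron, Sam, Joe, o]
    }
}
```

With a minimum length of 1, the "o" of the second "Joe" is also reported because another "o" appears later in the final "Ron", so single-character matching tends to produce noisier results.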