I want to split the string "aaaabbbccccaaddddcfggghhhh" into "aaaa", "bbb", "cccc", "aa", "dddd", "c", "f" and so on.
I tried this:
String[] arr = "aaaabbbccccaaddddcfggghhhh".split("(.)(?!\\1)");
But this eats away one character, so with the above regular expression I get "aaa" while I want it to be "aaaa" as the first string.
How do I achieve this?
Since Java 11, String has a repeat method that builds a string from N copies of the source string: String newString = "a".repeat(N); assertEquals(EXPECTED_STRING, newString);
In .NET, String.Split(Char, Int32, StringSplitOptions) splits a string into a maximum number of substrings based on the provided character separator, optionally omitting empty substrings from the result.
Answer: To split a string into individual characters, pass an empty string ("") as the regex argument of Java's split() method.
Try this:
import java.util.Arrays;

String str = "aaaabbbccccaaddddcfggghhhh";
String[] out = str.split("(?<=(.))(?!\\1)");
System.out.println(Arrays.toString(out));
=> [aaaa, bbb, cccc, aa, dddd, c, f, ggg, hhhh]
Explanation: we want to split the string into groups of identical chars, so we need to find the "boundary" between each pair of adjacent groups. I'm using Java's syntax for a positive look-behind to capture the previous char, and then a negative look-ahead with a back reference to verify that the next char is not the same as the previous one. No characters are actually consumed, because the pattern consists only of two look-around assertions (that is, the regular expression is zero-width). The pattern also matches at the very end of the string, but split() discards the resulting trailing empty string by default.
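To make the difference concrete, here is a small sketch that runs the pattern from the question next to the look-around pattern on a shorter input. The class name RunSplitDemo and the helper names splitRuns and splitConsuming are made up for illustration; only String.split and Arrays.toString come from the JDK.

```java
import java.util.Arrays;

public class RunSplitDemo {
    // Zero-width boundary: the look-behind captures the previous char in
    // group 1, and the look-ahead requires that the next char is not that char.
    static String[] splitRuns(String s) {
        return s.split("(?<=(.))(?!\\1)");
    }

    // The pattern from the question: (.) consumes the last char of each run,
    // so that char becomes part of the delimiter and is lost.
    static String[] splitConsuming(String s) {
        return s.split("(.)(?!\\1)");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitConsuming("aaaabbb"))); // [aaa, bb]
        System.out.println(Arrays.toString(splitRuns("aaaabbb")));      // [aaaa, bbb]
    }
}
```

Note that "." does not match line terminators by default, so input containing newlines may need the DOTALL flag.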
What about capturing in a lookbehind?
(?<=(.))(?!\1|$)
as a Java string:
(?<=(.))(?!\\1|$)
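As far as I can tell, the extra |$ only changes the result when trailing empty strings are kept, for example when split is called with a negative limit. The sketch below compares the two patterns under that assumption; the class name TrailingMatchDemo, the constant names, and the helper splitKeepingEmpties are hypothetical.

```java
import java.util.Arrays;

public class TrailingMatchDemo {
    // Matches at every run boundary and also at the end of the input.
    static final String WITHOUT_END_CHECK = "(?<=(.))(?!\\1)";
    // Same boundary test, but the look-ahead also rejects the end of the input.
    static final String WITH_END_CHECK = "(?<=(.))(?!\\1|$)";

    // A negative limit tells split to keep trailing empty strings.
    static String[] splitKeepingEmpties(String s, String regex) {
        return s.split(regex, -1);
    }

    public static void main(String[] args) {
        // The end-of-input match leaves an empty last element.
        System.out.println(Arrays.toString(splitKeepingEmpties("aabbb", WITHOUT_END_CHECK))); // [aa, bbb, ]
        // With |$ there is no match at the end, so no empty element.
        System.out.println(Arrays.toString(splitKeepingEmpties("aabbb", WITH_END_CHECK)));    // [aa, bbb]
    }
}
```

With the default split(regex) call both patterns give the same array, since trailing empty strings are removed anyway.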