Question

I want to split the string "aaaabbbccccaaddddcfggghhhh" into "aaaa", "bbb", "cccc", "aa", "dddd", "c", "f", and so on.

I tried this:

String[] arr = "aaaabbbccccaaddddcfggghhhh".split("(.)(?!\\1)");

But this consumes the last character of each run, because the (.) group is part of the match and split discards whatever it matches. With the above regular expression I get "aaa" as the first string, while I want "aaaa".

How do I achieve this?


Solution

Try this:

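import java.util.Arrays;

String   str = "aaaabbbccccaaddddcfggghhhh";
// Split at every zero-width position where the next char differs from the previous one.
String[] out = str.split("(?<=(.))(?!\\1)");

System.out.println(Arrays.toString(out));
// => [aaaa, bbb, cccc, aa, dddd, c, f, ggg, hhhh]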

Explanation: we want to split the string between groups of identical characters, so we need to find the boundary between each pair of adjacent groups. The positive look-behind (?<=(.)) captures the previous character, and the negative look-ahead (?!\1) uses a back-reference to verify that the next character is not the same as the previous one. No characters are consumed, because the pattern consists only of two look-around assertions (that is, the regular expression is zero-width).
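
To make the difference concrete, here is a minimal sketch that runs the pattern from the question and the zero-width pattern from this answer side by side. The class and method names (LookaroundSplitDemo, splitConsuming, splitZeroWidth) are made up for illustration and do not come from the original post.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class LookaroundSplitDemo {
    // Pattern from the question: (.) is part of the match, so split
    // throws away the last character of every run.
    static String[] splitConsuming(String s) {
        return s.split("(.)(?!\\1)");
    }

    // Pattern from the answer: only look-around assertions, so the match
    // is zero-width and no character is lost.
    static String[] splitZeroWidth(String s) {
        return s.split("(?<=(.))(?!\\1)");
    }

    public static void main(String[] args) {
        String str = "aaaabbbccccaaddddcfggghhhh";
        System.out.println(Arrays.toString(splitConsuming(str)));
        // prints [aaa, bb, ccc, a, ddd, , , gg, hhh]
        System.out.println(Arrays.toString(splitZeroWidth(str)));
        // prints [aaaa, bbb, cccc, aa, dddd, c, f, ggg, hhhh]
    }
}
```

Note that with the consuming pattern the single-character runs "c" and "f" are matched entirely, which is why they show up as empty strings in the first result.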

OTHER TIPS

What about capturing in a lookbehind?

(?<=(.))(?!\1|$)

As a Java string:

(?<=(.))(?!\\1|$)
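
The |$ alternative appears to be there to stop the pattern from also matching at the very end of the input. With a plain split(regex) call this makes no visible difference in Java, because trailing empty strings are discarded anyway. The following sketch makes the effect visible by passing a negative limit, which keeps trailing empty strings. The class name TrailingBoundaryDemo, the helper splitKeepingTrailing, and the short input "aabbb" are assumptions chosen for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class TrailingBoundaryDemo {
    static String[] splitKeepingTrailing(String s, String regex) {
        // A negative limit tells String.split to keep trailing empty strings.
        return s.split(regex, -1);
    }

    public static void main(String[] args) {
        String str = "aabbb";
        // Without |$ the pattern also matches at the end of the input,
        // which leaves an empty string after the last run.
        System.out.println(Arrays.toString(splitKeepingTrailing(str, "(?<=(.))(?!\\1)")));
        // prints [aa, bbb, ]
        // With |$ the look-ahead rejects the end of the input.
        System.out.println(Arrays.toString(splitKeepingTrailing(str, "(?<=(.))(?!\\1|$)")));
        // prints [aa, bbb]
    }
}
```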

Here I take each character in turn and print it. The if statement then checks two conditions: that the next index is still within the array, and that the next character differs from the current one. If both hold, a line break is printed so that the next group starts on a new line; otherwise the loop simply continues.

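// Assumes arr holds the characters of the input string.
char[] arr = "aaaabbbccccaaddddcfggghhhh".toCharArray();
for (int i = 0; i < arr.length; i++) {
    char chr = arr[i];
    System.out.print(chr);
    // Start a new line when a next character exists and differs from the current one.
    if (i + 1 < arr.length && arr[i + 1] != chr) {
        System.out.print(" \n");
    }
}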
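
If the groups are needed as values rather than printed output, the same loop idea can collect them into a list. This is only a sketch of that variation, not part of the original answer; the class name RunCollector and the method runs are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the names are illustrative only.
class RunCollector {
    static List<String> runs(String s) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char chr = s.charAt(i);
            current.append(chr);
            // Close the run at the end of the input or when the next character differs.
            if (i + 1 == s.length() || s.charAt(i + 1) != chr) {
                out.add(current.toString());
                current.setLength(0);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(runs("aaaabbbccccaaddddcfggghhhh"));
        // prints [aaaa, bbb, cccc, aa, dddd, c, f, ggg, hhhh]
    }
}
```

Unlike the regex versions, this needs no look-around support and returns an empty list for an empty input string.
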
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow