Question

System.out.println(
    Arrays.deepToString(
        "abc<def>ghi".split("(?:<)|(?:>)")
    )
);

This prints [abc, def, ghi], as if I had split on "<|>". I want it to print [abc, <def>, ghi]. Is there a way to work some regex magic to accomplish what I want here?


Perhaps a simpler example:

System.out.println(
    Arrays.deepToString(
        "Hello! Oh my!! Good bye!!".split("(?:!+)")
    )
);

This prints [Hello, Oh my, Good bye]. I want it to print [Hello!, Oh my!!, Good bye!!]. `.

Was it helpful?

Solution 3

Thanks to information from Cine, I think these are the answers I'm looking for:

System.out.println(
    Arrays.deepToString(
        "abc<def>ghi<x><x>".split("(?=<)|(?<=>)")
    )
); // [abc, <def>, ghi, <x>, <x>]


System.out.println(
    Arrays.deepToString(
        "Hello! Oh my!! Good bye!! IT WORKS!!!".split("(?<=!++)")
    )
); // [Hello!,  Oh my!!,  Good bye!!,  IT WORKS!!!]

Now, the second one was honestly discovered by experimenting with all the different quantifiers. Neither greedy nor reluctant work, but possessive does.

I'm still not sure why.

OTHER TIPS

You need to take a look at zero width matching constructs:

(?=X)   X, via zero-width positive lookahead
(?!X)   X, via zero-width negative lookahead
(?<=X)  X, via zero-width positive lookbehind
(?<!X)  X, via zero-width negative lookbehind

You can use \b (word boundary) as what to look for as it is zero-width and use that as the anchor for looking for < and >.

String s = "abc<def>ghi";
String[] bits = s.split("(?<=>)\\b|\\b(?=<)");
for (String bit : bits) {
  System.out.println(bit);
}

Output:

abc
<def>
ghi

Now that isn't a general solution. You will probably need to write a custom split method for that.

Your second example suggests it's not really split() you're after but a regex matching loop. For example:

String s = "Hello! Oh my!! Good bye!!";
Pattern p = Pattern.compile("(.*?!+)\\s*");
Matcher m = p.matcher(s);
while (m.find()) {
  System.out.println("[" + m.group(1) + "]");
}

Output:

[Hello!]
[Oh my!!]
[Good bye!!]
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top