While this does not really answer the question, it explains how lookarounds work.
Lookarounds are anchors: they do not consume text, but find a position in the input text. Your regex can be written much more simply:
(?<=-)(?<!\\-)|(?=-)(?<!\\)
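You have four lookaround groups here, of three kinds: positive lookbehind (?<=...), negative lookbehind (?<!...) and positive lookahead (?=...). The fourth kind, negative lookahead (?!...), is not needed by this regex.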
The full regex reads:
(?<=-) # Find a position where what precedes is a dash
(?<!\\-) # Find a position where what precedes is not \-
| # Or
(?=-) # Find a position where what follows is a dash
(?<!\\) # Find a position where what precedes is not a \
Note the term "position": an anchor never advances in the input text; it only checks a condition at the current position.
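To make the "position" idea concrete, here is a minimal sketch comparing a lookahead with a plain dash on the input a-b. The class name ZeroWidthDemo and the helper firstMatch are made up for this illustration; only Pattern and Matcher come from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class ZeroWidthDemo
{
    // Hypothetical helper: returns "start,end,[text]" for the first match, or null
    static String firstMatch(final String regex, final String input)
    {
        final Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find())
            return null;
        return m.start() + "," + m.end() + ",[" + m.group() + "]";
    }

    public static void main(final String... args)
    {
        // Lookahead: finds the position before the dash, consumes nothing
        System.out.println(firstMatch("(?=-)", "a-b")); // 1,1,[]
        // Plain dash: found at the same position, but consumes the dash
        System.out.println(firstMatch("-", "a-b"));     // 1,2,[-]
    }
}
```

With the lookahead, start and end are equal and the matched text is empty: the regex found a position, not a character.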
Now, if we try and match that regex against a-b\-c:
# Step 1
# Input:    | a-b\-c|
# Position: |^      |
# Regex:    | (?<=-)(?<!\\-)|(?=-)(?<!\\)|
# Position: |^                           |
# No match, try other alternative
# Input:    | a-b\-c|
# Position: |^      |
# Regex:    |(?<=-)(?<!\\-)| (?=-)(?<!\\)|
# Position: |               ^            |
# No match, regex fails
# Advance one position in the input text and try again
# Step 2
# Input:    |a -b\-c|
# Position: | ^     |
# Regex:    | (?<=-)(?<!\\-)|(?=-)(?<!\\)|
# Position: |^                           |
# No match, try other alternative
# Input:    |a -b\-c|
# Position: | ^     |
# Regex:    |(?<=-)(?<!\\-)| (?=-)(?<!\\)|
# Position: |               ^            |
# Match: a "-" follows
# Input:    |a -b\-c|
# Position: | ^     |
# Regex:    |(?<=-)(?<!\\-)|(?=-) (?<!\\)|
# Position: |                    ^       |
# Match: what precedes is not a \
# Input:    |a -b\-c|
# Position: | ^     |
# Regex:    |(?<=-)(?<!\\-)|(?=-)(?<!\\) |
# Position: |                           ^|
# Regex is satisfied
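The walkthrough above can be checked with a short sketch that lists every position where the regex is satisfied and then splits the input at those positions. The use of split is an assumption about how the regex is meant to be used, and the class name SplitPositions and the helper positions are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class SplitPositions
{
    // The regex discussed above, written as a Java string literal
    static final Pattern BOUNDARY
        = Pattern.compile("(?<=-)(?<!\\\\-)|(?=-)(?<!\\\\)");

    // Hypothetical helper: index of every position where the regex is satisfied
    static List<Integer> positions(final String input)
    {
        final List<Integer> result = new ArrayList<>();
        final Matcher m = BOUNDARY.matcher(input);
        while (m.find())
            result.add(m.start());
        return result;
    }

    public static void main(final String... args)
    {
        final String input = "a-b\\-c";
        // Positions just before and just after the unescaped dash
        System.out.println(positions(input));
        // Splitting at those positions; requires Java 8 or later semantics
        System.out.println(Arrays.toString(BOUNDARY.split(input)));
    }
}
```

For a-b\-c this should print [1, 2] and then [a, -, b\-c]: the regex is satisfied on each side of the unescaped dash, and nowhere around the escaped one.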
Here is an alternative which uses neither split nor lookarounds:
[a-z]+(\\-[a-z]+)*|-
You can use this regex in a Pattern and use a Matcher:
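import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class Main
{
    public static void main(final String... args)
    {
        // In a Java string literal, "\\\\" is the regex \\, which matches one backslash
        final Pattern pattern
            = Pattern.compile("[a-z]+(\\\\-[a-z]+)*|-");
        final Matcher m = pattern.matcher("a-b\\-c");
        // Prints one token per line: a, then -, then b\-c
        while (m.find())
            System.out.println(m.group());
    }
}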