Splitting a string on a backslash using regex

Question 1

To capture a delimiter, it's easier to use findall instead of split:

re.findall(r'[^\\/]+|[\\/]', string)

[^\\/]+ matches one or more consecutive characters that are neither a forward slash nor a backslash. | works as an "or" operator. Finally, [\\/] matches a single forward slash or backslash. The result is a list in which every slash is its own item and every run of non-slash characters between slashes is its own item.
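
To make the behaviour of that pattern concrete, here is a minimal sketch. The sample string is made up for illustration and is not taken from the question:

```python
import re

# Hypothetical sample input; its actual characters are: a / b \ c
sample = 'a/b\\c'

# Either a run of non-slash characters, or a single slash of either kind.
tokens = re.findall(r'[^\\/]+|[\\/]', sample)
print(tokens)  # ['a', '/', 'b', '\\', 'c']
```

Note that the printed list shows the backslash as '\\' only because that is how Python displays it; the item itself is a single character.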

As for why your code didn't work, your expression is (\\/). When the Python interpreter parses this, it sees an escaped backslash and creates a string of four characters: ( \ / ). Then, this string is sent to the regex engine, which also does escaping. It sees a backslash followed by a slash, and since a slash is not special, it "escapes" to itself, so the final expression is just (/). Finally, re applies this expression, splits on the forward slash only and captures it, which is exactly what you're observing.
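
The two rounds of escaping can be checked directly. This is a small sketch, again using a made-up sample string:

```python
import re

# Non-raw literal: Python turns \\ into one backslash.
pattern = '(\\/)'
print(len(pattern))  # 4
print(pattern)       # (\/)

# Hypothetical sample input; its actual characters are: a / b \ c
sample = 'a/b\\c'

# The regex engine reads \/ as a plain slash, so only '/' is a delimiter.
parts = re.split(pattern, sample)
print(parts)  # ['a', '/', 'b\\c']
```

The backslash in the sample is left untouched, which matches the behaviour described above.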

The correct command for your approach would be re.split('([\\\\/])', string) due to double escaping: in a non-raw literal, every backslash meant for the regex engine has to be written twice.

The moral of the story: always use raw literals r"..." with regexes to avoid double escaping issues.
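
As a quick illustration of that point, the sketch below compares a raw literal with an ordinary literal that spells out every backslash. The sample string is an assumption for demonstration purposes:

```python
import re

# Hypothetical sample input; its actual characters are: a / b \ c
sample = 'a/b\\c'

raw = r'([\\/])'     # raw literal: what you type is what the regex engine gets
plain = '([\\\\/])'  # ordinary literal: each regex backslash is doubled

# Both literals produce the same seven-character pattern.
print(raw == plain)  # True

parts = re.split(raw, sample)
print(parts)  # ['a', '/', 'b', '\\', 'c']
```

The raw form is shorter and easier to check by eye against the regex you actually intend.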

Question 2

I think this solution gives exactly what you want:

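import re

# The test string ends with a backslash, so the last piece is empty.
testStr = '-------/--------\\---------/------\\'
# Raw literal: \\ matches one backslash; the group keeps the delimiters.
parts = re.split(r'(\\|/)', testStr)
for p in parts:
    print('p=' + p)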

Result:

p=-------
p=/
p=--------
p=\
p=---------
p=/
p=------
p=\
p=