Try using .split with this regex:
(?<=[^\.a-zA-Z\d])|(?=[^\.a-zA-Z\d])
It splits the string at every position that is immediately preceded or followed by a character that is neither a period nor an ASCII letter or digit.
(?<=[^\.a-zA-Z\d]) is a positive lookbehind. It matches the position between two characters if the text immediately before that position matches the sub-regex inside (?<=...).
[^\.a-zA-Z\d] is a negated character class. It matches a single character that is not listed inside [^...].
\. matches a literal period (.).
a-z matches any lowercase letter between a and z.
A-Z is the same, but for uppercase.
\d is the equivalent of [0-9], so it matches any digit.
| is the equivalent of an "or". It makes the regex match either the half before it or the half after it.
(?=[^\.a-zA-Z\d]) is the same as the first half of the regex, except that it is a positive lookahead. It matches the position between two characters if the text immediately after that position matches the sub-regex inside (?=...).
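To see what each half contributes, here is a small sketch that splits the same input with the lookbehind alone, the lookahead alone, and both combined. The class name LookaroundHalves, the helper splitWith, and the sample input "a+b" are made up for illustration and are not part of the original answer.

```java
import java.util.Arrays;

public class LookaroundHalves {
    // Position right AFTER a character that is not a period, ASCII letter, or digit.
    public static final String BEHIND = "(?<=[^\\.a-zA-Z\\d])";
    // Position right BEFORE such a character.
    public static final String AHEAD = "(?=[^\\.a-zA-Z\\d])";

    // Hypothetical helper: just forwards to String.split.
    public static String[] splitWith(String regex, String input) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        String input = "a+b";
        // Lookbehind only: splits after the '+'
        System.out.println(Arrays.toString(splitWith(BEHIND, input)));               // [a+, b]
        // Lookahead only: splits before the '+'
        System.out.println(Arrays.toString(splitWith(AHEAD, input)));                // [a, +b]
        // Both halves joined with '|': the '+' becomes its own piece
        System.out.println(Arrays.toString(splitWith(BEHIND + "|" + AHEAD, input))); // [a, +, b]
    }
}
```

Because both lookarounds are zero-width, no characters are consumed by the split, which is why the operator itself survives as a separate piece.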
You can use this regex in Java like this:
String str = "sin(4+3)-8";
String[] parts = str.split("(?<=[^\\.a-zA-Z\\d])|(?=[^\\.a-zA-Z\\d])");
Result:
["sin", "(", "4", "+", "3", ")", "-", "8"]
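For anyone who wants to run this end to end, here is a self-contained sketch built around the same regex. The class name ExpressionSplitDemo, the method tokenize, and the second input with a decimal number are assumptions added for illustration. The decimal case shows why the period is kept out of the delimiter set.

```java
import java.util.Arrays;

public class ExpressionSplitDemo {
    // Same regex as above: split before or after any character that is
    // not a period, an ASCII letter, or a digit.
    private static final String SPLIT_REGEX =
            "(?<=[^\\.a-zA-Z\\d])|(?=[^\\.a-zA-Z\\d])";

    // Hypothetical helper name; it only wraps String.split.
    public static String[] tokenize(String expression) {
        return expression.split(SPLIT_REGEX);
    }

    public static void main(String[] args) {
        // The example from the answer
        System.out.println(Arrays.toString(tokenize("sin(4+3)-8")));
        // prints [sin, (, 4, +, 3, ), -, 8]

        // A decimal number stays in one piece because '.' is not a delimiter
        System.out.println(Arrays.toString(tokenize("sin(4.5+3)-8")));
        // prints [sin, (, 4.5, +, 3, ), -, 8]
    }
}
```

Note that whitespace is also neither a period, a letter, nor a digit, so an input such as "1 + 2" would return each space as its own piece; strip or filter those if the input may contain spaces.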