You can restrict a PCRE subpattern directly with a (?![',]) negative lookahead, which fails the match if the next char to the right is ' or ,:
[[:space:]]|(?=(?![',])[[:punct:]])
               ^^^^^^^^
See the regex demo.
Details
[[:space:]] - any whitespace
| - or
(?=(?![',])[[:punct:]]) - a positive lookahead that requires that, immediately to the right of the current position, the next char is neither ' nor , and is a punctuation char (effectively, it requires any punctuation symbol other than ' and ,).
See the R online demo:
X <- "I'm not that good at regex yet, but am getting better!"
strsplit(X, "[[:space:]]|(?=(?![',])[[:punct:]])", perl=TRUE)
[[1]]
[1] "I'm" "not" "that" "good" "at" "regex" "yet,"
[8] "but" "am" "getting" "better" "!"
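If the set of punctuation chars to keep attached may change, the pattern can be built from a parameter. The following is only a sketch: split_keep and its keep argument are hypothetical names, not part of base R, and keep is assumed to contain only chars that are safe to paste into a bracket expression as-is (so no ], ^, - or backslash).

```r
# Hypothetical helper (not part of base R): split on whitespace, or right
# before any punctuation char that is not listed in `keep`.
# Assumption: `keep` holds only chars that are literal inside [...].
split_keep <- function(x, keep = "',") {
  # Insert the excluded chars into the nested negative lookahead
  pattern <- sprintf("[[:space:]]|(?=(?![%s])[[:punct:]])", keep)
  strsplit(x, pattern, perl = TRUE)
}

X <- "I'm not that good at regex yet, but am getting better!"
split_keep(X)
```

With the default keep value, the helper builds exactly the pattern used above, so the call is equivalent to the strsplit line in the demo.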