Since it is a multiline string, you need to use the re.DOTALL option, like this:
p = re.compile('[A-Z]+:.*?(?=[A-Z]+:|$)', re.DOTALL)
Output
set(["DEALER: 'S up, Bubbless?\n",
'JUNKIE: Well, what you got?\n',
'DEALER: Well, there you go.\n',
'DEALER: I got some starters. ',
'BUBBLES: Hey.\n'])
Quoting from the re.DOTALL docs:
Make the '.' special character match any character at all, including a newline; without this flag, '.' will match anything except a newline.
So, without that option, .*? doesn't match \n, and a match can never extend past the end of its line to reach the next speaker label. That's why none of the other strings got matched.
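To see the difference side by side, here is a minimal sketch that runs the same pattern with and without re.DOTALL. The `text` value is a shortened, made-up sample in the same shape as the dialogue (it is not the original input), and the variable names are only for illustration.

```python
import re

# Hypothetical sample input: two speeches end in a newline,
# and two speeches share the last line.
text = (
    "DEALER: 'S up?\n"
    "JUNKIE: Well, what you got?\n"
    "DEALER: I got some starters. BUBBLES: Hey."
)

pattern = r'[A-Z]+:.*?(?=[A-Z]+:|$)'

# Without DOTALL, '.' stops at '\n', so only speeches that reach the next
# speaker label (or the end of the string) on the same line can match.
without_dotall = re.findall(pattern, text)

# With DOTALL, '.' also consumes '\n', so every speech is captured.
with_dotall = re.findall(pattern, text, re.DOTALL)

print(without_dotall)
print(with_dotall)
```

With this sample, the first call should return only the two speeches from the last line, while the second returns all four, newlines included.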