Like the commenter, I would point you to Glyph's answer to the multiprocessing question.
With that you could spawn a fleet of blocking regex-matching processes and communicate with them over their childFDs, using the IProcessProtocol.childDataReceived and IProcessTransport.writeToChild methods.
This would let your Twisted reactor continue to run at full speed and should get you a lot closer to your non-processing numbers. You would still pay some CPU time for managing the extra file descriptors, but that should be tiny compared to letting the regex block the reactor.