Ok, classic case of finding the answer just as the question was written...
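The PLY manual clearly states that there is an optimize mode meant for exactly this use case. PLY normally reads its token and grammar definitions from docstrings, so the tables have to be generated once with Python running in normal mode; after that, optimize mode uses the cached tables without needing the docstrings. I had assumed the option referred to some other kind of performance optimization.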
Note that since -OO removes docstrings, instantiating your lexer and parser with optimize=1 will not handle empty rules like the one below:
    def p_commands(self, p):
        """commands :
                    | commands command"""
        # This will fail when running optimized

    def p_command(self, p):
        """command : foo
                   | bar"""
        p[0] = p[1]
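
If you want to see the docstring stripping in isolation, here is a minimal sketch using only the standard library (no PLY involved). It compiles a rule function at optimization level 0 and at level 2, which is the level -OO uses. The helper name docstring_at and the standalone p_command are just for illustration and are not part of PLY.

```python
# A rule function in the PLY style, kept as source text so it can be
# compiled at different optimization levels.
source = '''
def p_command(p):
    """command : foo
               | bar"""
    p[0] = p[1]
'''


def docstring_at(optimize_level):
    """Compile the source at the given level and return p_command's docstring."""
    namespace = {}
    code = compile(source, "<grammar>", "exec", optimize=optimize_level)
    exec(code, namespace)
    return namespace["p_command"].__doc__


print(docstring_at(0))  # the grammar text is still attached to the function
print(docstring_at(2))  # None
```

At level 2 the function has no docstring left, which is why anything that needs to read the grammar at that point has nothing to work with.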