Pergunta

I want to match the docstrings of a Python file. Eg.

r""" Hello this is Foo
     """

Using only """ should be enough for the start.

>>> data = 'r""" Hello this is Foo\n     """'
>>> def display(m):
...     if not m:
...             return None
...     else:
...             return '<Match: %r, groups=%r>' % (m.group(), m.groups())
...
>>> import re
>>> print display(re.match('r?"""(.*?)"""', data, re.S))
<Match: 'r""" Hello this is Foo\n     """', groups=(' Hello this is Foo\n     ',)>
>>> print display(re.match('r?(""")(.*?)\1', data, re.S))
None

Can someone please explain to me why the first expression matches and the other does not?

Foi útil?

Solução

You are using the escape sequence \1 instead of the backreference \1.

You can fix this by changing to escaping the \ before 1.

print display(re.match('r?(""")(.*?)\\1', data, re.S))

You can also fix it by using a raw string for your regex, with no escape sequences.

print display(re.match(r'r?(""")(.*?)\1', data, re.S))

Outras dicas

I think you might be missing the re.DOTALL or re.MULTILINE flags. In this case a re.DOTALL should allow your regex .*? to match newlines as well

Licenciado em: CC-BY-SA com atribuição
Não afiliado a StackOverflow
scroll top