Question

I want to match the docstrings in a Python file, for example:

r""" Hello this is Foo
     """

Using only """ should be enough for the start.

>>> data = 'r""" Hello this is Foo\n     """'
>>> def display(m):
...     if not m:
...             return None
...     else:
...             return '<Match: %r, groups=%r>' % (m.group(), m.groups())
...
>>> import re
>>> print display(re.match('r?"""(.*?)"""', data, re.S))
<Match: 'r""" Hello this is Foo\n     """', groups=(' Hello this is Foo\n     ',)>
>>> print display(re.match('r?(""")(.*?)\1', data, re.S))
None

Can someone please explain to me why the first expression matches and the other does not?


Solution

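In a normal (non-raw) Python string literal, '\1' is an escape sequence for the character with octal code 1 ('\x01'), so the regex engine receives that control character instead of the backreference \1.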

You can fix this by escaping the backslash before the 1.

print display(re.match('r?(""")(.*?)\\1', data, re.S))

You can also fix it by using a raw string for your regex, in which backslashes are not processed as escape sequences.

print display(re.match(r'r?(""")(.*?)\1', data, re.S))
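To make the difference visible, here is a minimal sketch (written in Python 3 syntax, unlike the session in the question) that compares the two literals and reruns both patterns. The variable names `plain`, `raw`, `no_match` and `match` are only illustrative.

```python
import re

data = 'r""" Hello this is Foo\n     """'

# In a normal string literal, '\1' is an octal escape: a single character
# with code 1, not a backslash followed by the digit 1.
plain = '\1'
# In a raw string the backslash is kept, so the regex engine sees \1.
raw = r'\1'

print(len(plain), repr(plain))  # 1 '\x01'
print(len(raw), repr(raw))      # 2 '\\1'

# Pattern built from a normal literal: it looks for the character '\x01'.
no_match = re.match('r?(""")(.*?)\1', data, re.S)
# Pattern built from a raw literal: \1 refers back to group 1 (the quotes).
match = re.match(r'r?(""")(.*?)\1', data, re.S)

print(no_match)               # None
print(repr(match.group(2)))   # ' Hello this is Foo\n     '
```

Writing regex patterns as raw strings by default avoids this class of problem, since no backslash is consumed by Python before the pattern reaches the re module.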

Other tips

I think you might be missing the re.DOTALL or re.MULTILINE flags. In this case, re.DOTALL should allow the .*? in your regex to match newlines as well.
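As a quick check of how this flag behaves, the sketch below (Python 3 syntax, illustrative variable names) runs the first pattern from the question with and without re.DOTALL. Note that re.S, which the question already passes, is the short alias of re.DOTALL, so this tip may not change the outcome there.

```python
import re

data = 'r""" Hello this is Foo\n     """'

# re.S and re.DOTALL are two names for the same flag.
same_flag = re.S == re.DOTALL

# Without the flag, '.' does not match '\n', so the lazy group cannot
# reach the closing quotes on the second line.
without_flag = re.match(r'r?"""(.*?)"""', data)

# With the flag, '.' also matches newlines and the docstring body is captured.
with_flag = re.match(r'r?"""(.*?)"""', data, re.DOTALL)

print(same_flag)
print(without_flag)
print(repr(with_flag.group(1)))
```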

Licensed under: CC-BY-SA with attribution