Python re.sub with a flag does not replace all occurrences
09-06-2019
Question
The Python docs say:
re.MULTILINE: When specified, the pattern character '^' matches at the beginning of the string and at the beginning of each line (immediately following each newline)... By default, '^' matches only at the beginning of the string...
So what's going on when I get the following unexpected result?
>>> import re
>>> s = """// The quick brown fox.
... // Jumped over the lazy dog."""
>>> re.sub('^//', '', s, re.MULTILINE)
' The quick brown fox.\n// Jumped over the lazy dog.'
Solution
Look at the definition of re.sub:
sub(pattern, repl, string[, count])
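The fourth argument is count, so re.MULTILINE (which is 8) is being used as the count, not as a flag. With no MULTILINE flag in effect, '^' matches only at the start of the string, so only the first '//' is removed.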
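To make the mix-up concrete, here is a small sketch. The sample strings and variable names are made up for illustration, and count is passed by keyword so the effect is explicit; the positional call in the question ends up in the same parameter.

```python
import re

# re.MULTILINE is an integer-valued flag, so it is accepted as a count.
print(int(re.MULTILINE))  # 8

# Used as a count, it simply caps the number of replacements at 8.
capped = re.sub('a', 'b', 'a' * 10, count=re.MULTILINE)
print(capped)  # bbbbbbbbaa

# No flag is in effect here, so '^' matches only at the start of the string
# and only the first '//' is removed.
s = "// one\n// two"
as_count = re.sub('^//', '', s, count=re.MULTILINE)
print(repr(as_count))  # ' one\n// two'
```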
In versions of Python before 2.7, the only way to use flags with re.sub was to compile the regex first (this still works today):
re.sub(re.compile('^//', re.MULTILINE), '', s)
A flags argument was added in Python 2.7, so the full definition is now:
re.sub(pattern, repl, string[, count, flags])
So passing the flag by keyword works:
re.sub('^//', '', s, flags=re.MULTILINE)
OTHER TIPS
re.sub('(?m)^//', '', s)
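The (?m) prefix is an inline flag: it turns on multiline mode from inside the pattern itself, so nothing has to be passed to re.sub beyond the usual three arguments. A quick check, reusing the string from the question, that it should give the same result as the keyword form:

```python
import re

s = "// The quick brown fox.\n// Jumped over the lazy dog."

# Inline flag at the very start of the pattern enables MULTILINE.
inline = re.sub('(?m)^//', '', s)

# Same substitution with the flag passed by keyword.
keyword = re.sub('^//', '', s, flags=re.MULTILINE)

print(repr(inline))       # ' The quick brown fox.\n Jumped over the lazy dog.'
print(inline == keyword)  # True
```

Keeping the inline flag at the very start of the pattern, as here, is the safe placement.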
The full definition of re.sub is:
re.sub(pattern, repl, string[, count, flags])
This means that if you name the parameter, you can pass flags without passing count:
re.sub('^//', '', s, flags=re.MULTILINE)
or, more concisely:
re.sub('^//', '', s, flags=re.M)
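If you do want both a count and a flag, naming both keeps them from being confused. A minimal sketch, with a made-up three-line string and a '#' replacement chosen only for illustration:

```python
import re

s = "// a\n// b\n// c"

# re.M is just a short alias for re.MULTILINE.
print(re.M == re.MULTILINE)  # True

# Replace only the first two line-leading '//' markers.
first_two = re.sub('^//', '#', s, count=2, flags=re.M)
print(repr(first_two))  # '# a\n# b\n// c'
```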