Question

I'm a Python beginner, so keep in mind my regex skills are level -122.

I need to convert a string with text containing file1 to file01, but not convert file10 to file010.

My program is wrong, but this is the closest I can get. I've tried dozens of combinations and none of them work:

import re
txt = 'file8, file9, file10'
pat = r"[0-9]"
regexp = re.compile(pat)
print(regexp.sub(r"0\d", txt))

Can someone tell me what's wrong with my pattern and substitution and give me some suggestions?

Solution

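Your pattern [0-9] matches each digit individually, so both digits of 10 would get a 0 in front, and \d in the replacement string does not refer to the matched digit; to reuse matched text you need a capture group and a backreference such as \1. You could capture the number and check its length before adding a 0, but you might be able to use this instead: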

import re
txt = 'file8, file9, file10'
pat = r"(?<!\d)(\d)(?=,|$)"
regexp = re.compile(pat)
print(regexp.sub(r"0\1", txt))

(?<! ... ) is called a negative lookbehind. It prevents a match at any position where the text immediately before that position matches the pattern inside the lookbehind. For example, (?<!a)b matches every b in a string except one that has an a directly before it: the b characters in bb and cb match, but the b in ab doesn't. (?<!\d)(\d) thus matches a digit unless there is another digit directly before it.
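To see the lookbehind on its own, here is a minimal sketch. The sample strings are just the ones from the explanation above, and the variable names are made up for illustration.

```python
import re

# (?<!a)b matches a "b" only when the character right before it is not "a".
pattern = re.compile(r"(?<!a)b")

for sample in ["bb", "cb", "ab"]:
    # findall returns every "b" that passed the lookbehind check.
    print(sample, pattern.findall(sample))

# The same idea with digits: a digit that has no digit directly before it.
# In "file10" only the "1" qualifies; the "0" is preceded by a digit.
first_digits = re.findall(r"(?<!\d)\d", "file8, file9, file10")
print(first_digits)  # ['8', '9', '1']
```

Note that the lookbehind alone still matches the 1 in file10, which is why the full pattern also needs the lookahead.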

(\d) is a single digit enclosed in a capture group, denoted by plain parentheses. The matched digit is stored in the first capture group, which the replacement string refers to as \1.

(?= ... ) is a positive lookahead. It allows a match only if the pattern inside the lookahead matches immediately after the current position, without consuming those characters. In other words, a(?=b) matches an a only if there's a b right after it: the a in ab matches, but the a in ac or aa doesn't.
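A small sketch of the lookahead in isolation, using the same ab / ac / aa strings. The point worth noticing is that the b is checked but not included in the match.

```python
import re

# a(?=b) matches an "a" only when a "b" comes right after it.
pattern = re.compile(r"a(?=b)")

for sample in ["ab", "ac", "aa"]:
    print(sample, bool(pattern.search(sample)))

# The lookahead does not consume text: the match is just "a", not "ab".
match = pattern.search("ab")
print(match.group(), match.span())  # a (0, 1)
```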

(?=,|$) is a positive lookahead containing ,|$ meaning either a comma, or the end of the string.

(?<!\d)(\d)(?=,|$) thus matches any digit, as long as there's no digit before it and there's a comma after it, or if that digit is at the end of the string.
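One thing to keep in mind: because of the (?=,|$) lookahead, this pattern leaves something like file1.txt alone, since the digit is followed by a dot. If that matters, the other option mentioned at the start (capture the number and check its length) could look roughly like this. The function name pad_single_digits is made up for this sketch, not something from the re module.

```python
import re

def pad_single_digits(text):
    """Put a 0 in front of every number that is exactly one digit long."""
    def pad(match):
        number = match.group()
        # Only single-digit numbers get the extra zero; longer ones pass through.
        return "0" + number if len(number) == 1 else number

    # \d+ grabs each whole run of digits, so "10" is handled as one number.
    return re.sub(r"\d+", pad, text)

print(pad_single_digits("file8, file9, file10"))   # file08, file09, file10
print(pad_single_digits("file1.txt file22.txt"))   # file01.txt file22.txt
```

Passing a function as the replacement lets you decide per match, at the cost of being a bit more verbose than a pure regex substitution.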

OTHER TIPS

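How about this? It works on one name at a time and needs no regex:

a = 'file1'
# Take the text after 'file', convert it to an int, and format it with two digits.
a = 'file' + "%02d" % int(a.split('file')[1])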
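To apply the same idea to the whole string from the question, you could wrap it in a helper and split on the separator first. This is only a sketch: pad_name is a hypothetical helper, and it assumes every item looks exactly like the prefix followed by digits.

```python
def pad_name(name, prefix="file"):
    """Zero-pad the number after `prefix` to two digits (hypothetical helper)."""
    # Raises IndexError or ValueError if `name` is not prefix + digits.
    number = int(name.split(prefix)[1])
    return prefix + "%02d" % number

txt = "file8, file9, file10"
padded = ", ".join(pad_name(name) for name in txt.split(", "))
print(padded)  # file08, file09, file10
```

If the input can contain anything other than clean fileN items, a regex-based substitution is probably the safer choice.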

Another approach uses a regex to find every sequence of digits and str.zfill to pad each one with zeros:

>>> import re
>>> txt = 'file8, file9, file10'
>>> re.sub(r'\d+', lambda m: m.group().zfill(2), txt)
'file08, file09, file10'
Licensed under: CC-BY-SA with attribution