Using a regular expression substitution command to insert leading zeros in front of numbers less than 10 in a string of filenames

StackOverflow https://stackoverflow.com/questions/18191633

Question

I am having trouble figuring out how to make this work with the substitution command, which is what I have been instructed to use. I am using this text as a variable:

text = 'file1, file2, file10, file20'

I want to search the text and insert a zero in front of any number less than 10. I thought I could use an if statement depending on whether re.match or re.findall finds only one digit after the text, but I can't get it to work. Here is my starting code, where I try to capture the letters and digits in groups and extract only the file names with a single digit:

import re
text = 'file1, file2, file10, file20'
mtch = re.findall('^([a-z]+)(\d{1})$',text)

But it doesn't work.

Solution 2

You can use:

re.sub(r'[a-zA-Z]\d(?=,|$)', lambda x: x.group(0)[0] + '0' + x.group(0)[1:], text)

OTHER TIPS

You can use re.sub with str.zfill:

>>> import re
>>> text = 'file1, file2, file10, file20'
>>> re.sub(r'(\d+)', lambda m: m.group(1).zfill(2), text)
'file01, file02, file10, file20'
# or
>>> re.sub(r'([a-z]+)(\d+)', lambda m: m.group(1) + m.group(2).zfill(2), text)
'file01, file02, file10, file20'
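
If the padding width needs to vary, the same idea can be wrapped in a small function. This is only a minimal sketch: the name pad_numbers and the width parameter are hypothetical and not part of the answer above.

```python
import re

def pad_numbers(text, width=2):
    """Left-pad every run of digits in text with zeros up to width characters."""
    # str.zfill leaves strings that are already long enough unchanged
    return re.sub(r'\d+', lambda m: m.group(0).zfill(width), text)

text = 'file1, file2, file10, file20'
print(pad_numbers(text))     # file01, file02, file10, file20
print(pad_numbers(text, 3))  # file001, file002, file010, file020
```

Note that this pads every run of digits in the string, not only those that follow a file name.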

Anchors anchor to the beginning and end of strings (or lines, in multi-line mode). What you're looking for are word boundaries. And of course, you don't need the {1} quantifier.

\b([a-z]+)(\d)\b

(Not sure how you plan to use your captures, so I'll leave those alone.)
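
To illustrate the difference, here is a rough sketch comparing the anchored pattern from the question with the word-boundary version. The final re.sub call is only an assumption about how the captures might be used; the replacement string is not from the question.

```python
import re

text = 'file1, file2, file10, file20'

# ^...$ must span the whole string, so nothing matches
anchored = re.findall(r'^([a-z]+)(\d)$', text)
print(anchored)  # []

# \b matches at word edges, so each single-digit name is found
bounded = re.findall(r'\b([a-z]+)(\d)\b', text)
print(bounded)  # [('file', '1'), ('file', '2')]

# Re-insert the captures with a zero between them;
# \g<1> is used because \10 would be read as group 10
padded = re.sub(r'\b([a-z]+)(\d)\b', r'\g<1>0\2', text)
print(padded)  # file01, file02, file10, file20
```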

Your pattern has the start (^) and end ($) anchors applied, so it can only match when the entire string is a single file name. Against the full list it never matches.

Try something like this

import re
text = "file1, file2, file3, file4, file10, file20, file100"
print(re.sub(r"(?<=[a-z])\d(?!\d),?", r"0\g<0>", text))

will result in

file01, file02, file03, file04, file10, file20, file100

This should work if you have a list like above or a single element name.

Explanation

(?<=[a-z]) - Checks that the preceding character is a lowercase letter, using a lookbehind

\d - matches a single digit

(?!\d) - Checks that the digit is not followed by another digit, using a negative lookahead

,? - allows for an optional comma in the list

0\g<0> - The pattern matches a single digit, so it is trivial to add a zero. \g<0> refers to the entire match.
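
Lookarounds are zero-width, so the neighbouring letter and digits never become part of the match. The sketch below assumes names made of lowercase letters followed by digits. It shows what the pattern actually matches, and it suggests that the optional comma can be dropped without changing the result.

```python
import re

text = "file1, file2, file10, file100"
pattern = r"(?<=[a-z])\d(?!\d)"

# Only lone digits directly after a letter are matched;
# the letter itself is not part of the match
matches = [m.group(0) for m in re.finditer(pattern, text)]
print(matches)  # ['1', '2']

# \g<0> stands for the whole match, so "0\g<0>" prefixes it with a zero
result = re.sub(pattern, r"0\g<0>", text)
print(result)  # file01, file02, file10, file100
```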

Licensed under: CC-BY-SA with attribution