
I am trying to replace variable-length items in a list using a regex. For example, the item "HD479659" should be replaced by "HD0000000479659"; I just need to insert seven 0s between the "HD" prefix and the digits. I wrote the following program, but every time I run it I get this error: "TypeError: object of type '_sre.SRE_Pattern' has no len()". Can you please help me solve this error?

Thank you very much.

Here is the program:

import xlrd  
import re
import string

wb = xlrd.open_workbook("3_1.xls") 

sh = wb.sheet_by_index(0) 




pat = re.compile(s_pat) 

pat1 = re.compile(s_pat1)

for rownum1 in range(sh.nrows): 

  str1= str(sh.row_values(rownum1))


  m1 = pat.findall(str1)


  for a in m1:


  print >> outfile, m1


I think your solution is too complicated. This one should do the job and is much simpler:

import re

def repl(match):
    return match.group(1) + ("0"*7) + match.group(2)

print(re.sub(r"(HD)([1-9]{1}[0-9]{5})", repl, "HD479659"))

See also: http://docs.python.org/library/re.html#re.sub
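As a variation on the snippet above, the same substitution can be written with a replacement template instead of a function. This is only a sketch: the names `HD_PATTERN` and `insert_zeros` are made up for illustration, and it assumes the IDs always look like "HD" followed by six digits, as in the question's example.

```python
import re

# Same pattern as above, compiled once so it can be reused.
HD_PATTERN = re.compile(r"(HD)([1-9][0-9]{5})")

def insert_zeros(text):
    # \g<1> is used instead of \1 so that the zeros that follow
    # are not read as part of the group number.
    return HD_PATTERN.sub(r"\g<1>0000000\g<2>", text)

print(insert_zeros("HD479659"))  # prints HD0000000479659
```

The function form is easier to extend if the number of zeros ever has to depend on the match; the template form is shorter when the inserted text is fixed.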


To transform a list of values, you have to iterate over all values. You don't have to search the matching values first:

import re

values_to_transform = [
    'HD479659',
    'does not match',
    'but does not matter'
]

def repl(match):
    return match.group(1) + ("0"*7) + match.group(2)

for value in values_to_transform:
    print(re.sub(r"(HD)([1-9]{1}[0-9]{5})", repl, value))

The result is:

HD0000000479659
does not match
but does not matter
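If the transformed values are needed later (for example, to write them to a file), they can be collected into a new list instead of printed. A minimal sketch, assuming the same pattern; `transform_all` and the sample values are hypothetical:

```python
import re

HD_PATTERN = re.compile(r"(HD)([1-9][0-9]{5})")

def repl(match):
    # Re-emit the prefix, then seven zeros, then the original digits.
    return match.group(1) + "0" * 7 + match.group(2)

def transform_all(values):
    # sub() returns the input unchanged when nothing matches,
    # so non-matching items pass through as they are.
    return [HD_PATTERN.sub(repl, value) for value in values]

sample = ["HD479659", "does not match", "see HD123456 and HD654321"]
print(transform_all(sample))
```

Note that `sub()` replaces every match in a string, so an item containing several IDs has all of them padded.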

Other tips

What you need to do is extract the variable-length portion of the ID explicitly, then pad it with zeros. The number of zeros to add is the desired length minus the matched length.

If I understand the pattern correctly you want to use the regex


At that point you can do

results = re.search(...bla...).groupdict()

This returns the dict {'zeroes': '', 'num': '479659'} in this case. From there you can pad as necessary.
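The padding step could look something like the sketch below. The regex from this answer is not shown, so `ID_PATTERN` is a guess that only reuses the group names from the dict above. The target width of 13 digits is also an assumption, taken from the question's example (seven zeros plus six digits), and `pad_id` is a made-up helper name.

```python
import re

# Hypothetical pattern: only the group names 'zeroes' and 'num'
# come from the answer; the rest is assumed.
ID_PATTERN = re.compile(r"^HD(?P<zeroes>0*)(?P<num>[1-9][0-9]*)$")

# Assumed width of the numeric part: 7 zeros + 6 digits in the example.
TARGET_DIGITS = 13

def pad_id(value, width=TARGET_DIGITS):
    match = ID_PATTERN.match(value)
    if match is None:
        return value  # leave anything that is not an HD id untouched
    groups = match.groupdict()
    # zfill adds (width - len(num)) zeros on the left, which is the
    # "desired length minus matched length" described above.
    return "HD" + groups["num"].zfill(width)

print(pad_id("HD479659"))  # prints HD0000000479659
```

Unlike inserting a fixed seven zeros, this keeps the total length constant, so a shorter ID such as "HD12345" would get eight zeros under the same assumptions.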

It's 5am at the moment or I'd have a better solution for you, but I hope this helps.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow