I have a set of strings as follows:

text:u'MUC-EC-099_SC-Memory-01_TC-25'
text:u'MUC-EC-099_SC-Memory-01_TC-26'
text:u'MUC-EC-099_SC-Memory-01_TC-27'

I extracted this data from an Xls file and converted it to strings. Now I have to extract the text inside the single quotes and put it in a list.

I'm expecting output like:

[MUC-EC-099_SC-Memory-01_TC-25, MUC-EC-099_SC-Memory-01_TC-26, MUC-EC-099_SC-Memory-01_TC-27]

Thanks in advance.


Solution

Use re.findall:

>>> import re
>>> strs = """text:u'MUC-EC-099_SC-Memory-01_TC-25'
text:u'MUC-EC-099_SC-Memory-01_TC-26'
text:u'MUC-EC-099_SC-Memory-01_TC-27'"""
>>> re.findall(r"'(.*?)'", strs, re.DOTALL)
['MUC-EC-099_SC-Memory-01_TC-25',
 'MUC-EC-099_SC-Memory-01_TC-26',
 'MUC-EC-099_SC-Memory-01_TC-27'
]
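
In case it helps, here is a small sketch of why the ? in .*? matters in that pattern. The variable names lazy and greedy are only for illustration. With re.DOTALL the dot also matches newlines, so a greedy .* runs from the first quote to the very last one, while the lazy .*? stops at the nearest closing quote.

```python
import re

strs = ("text:u'MUC-EC-099_SC-Memory-01_TC-25'\n"
        "text:u'MUC-EC-099_SC-Memory-01_TC-26'\n"
        "text:u'MUC-EC-099_SC-Memory-01_TC-27'")

# Lazy quantifier: each match ends at the nearest closing quote.
lazy = re.findall(r"'(.*?)'", strs, re.DOTALL)

# Greedy quantifier: with DOTALL, '.' also matches newlines, so one
# match spans from the first quote to the last quote in the text.
greedy = re.findall(r"'(.*)'", strs, re.DOTALL)

print(lazy)         # three separate values
print(len(greedy))  # a single, over-long match
```

If each value is known to sit on its own line, dropping re.DOTALL should also keep a greedy match from crossing lines, but the lazy form is the safer default.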

Other tips

You can use the following expression:

(?<=')[^']+(?=')

This matches one or more characters that are not ' and that are enclosed between ' and '.

Python Code:

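import re

quoted = re.compile(r"(?<=')[^']+(?=')")
values = []  # collected matches (this list was called i in the original snippet)
for value in quoted.findall(str(row[1])):  # row[1] is one cell read from the sheet
    values.append(value)
print(values)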
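
One caveat, sketched below under the assumption that several cells have been joined into one string (the names one_cell and two_cells are made up for this example). The lookaround pattern does not consume the quotes, so the text between one closing quote and the next opening quote also counts as "enclosed between ' and '". Applied to one cell at a time, as in the loop above, it behaves as expected. On a joined string, a capture group that consumes both quotes avoids the extra match.

```python
import re

lookaround = re.compile(r"(?<=')[^']+(?=')")
capture = re.compile(r"'([^']*)'")

one_cell = "text:u'MUC-EC-099_SC-Memory-01_TC-25'"
two_cells = ("text:u'MUC-EC-099_SC-Memory-01_TC-25'\n"
             "text:u'MUC-EC-099_SC-Memory-01_TC-26'")

# A single cell has only one pair of quotes, so there is nothing to confuse.
single = lookaround.findall(one_cell)

# On the joined string the gap between the two values also sits
# between two quote characters, so it is returned as a match.
joined_lookaround = lookaround.findall(two_cells)

# The capture-group pattern consumes both quotes of each value,
# so the gap can no longer match.
joined_capture = capture.findall(two_cells)

print(single)
print(joined_lookaround)
print(joined_capture)
```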

That text: prefix seems a little familiar. Are you using xlrd to extract it? In that case, you have the prefix because you're getting the wrapped Cell object, not the value in the cell. For example, I think you're doing something like

>>> sheet.cell(2,2)
number:4.0
>>> sheet.cell(3,3)
text:u'C'

To get the unwrapped object, use .value:

>>> sheet.cell(3,3).value
u'C'

(Remember that the u here is simply telling you the string is unicode; it's not a problem.)
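
Putting that together, a minimal sketch of reading a whole column as plain values, with no regex needed. This assumes the data really comes from xlrd. The helper names column_values and read_column, and the default column and sheet indexes, are hypothetical and would need adjusting to your workbook.

```python
def column_values(sheet, col):
    """Return the unwrapped cell values of one column of an xlrd-style sheet."""
    return [sheet.cell(row, col).value for row in range(sheet.nrows)]


def read_column(path, col=0, sheet_index=0):
    """Open a workbook with xlrd and read one column (hypothetical helper)."""
    import xlrd  # third-party package, imported here so column_values has no dependency

    book = xlrd.open_workbook(path)
    sheet = book.sheet_by_index(sheet_index)
    return column_values(sheet, col)
```

Something like read_column("your_file.xls", col=1) would then return the strings directly. xlrd sheets also have a col_values method that, as far as I know, returns the same list in one call.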

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow