I want to remove all punctuation from a string, x. I'd like to use re.findall(), but I've been struggling to work out what pattern to pass it. I know that I can get all the punctuation characters by writing:

import string
y = string.punctuation

But if I write:

re.findall(y,x) 

it raises:

 raise error("multiple repeat")
 sre_constants.error: multiple repeat

Can someone explain what exactly we should pass to re.findall()?


Solution

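You may not even need a regex for this. You can simply use str.translate, like this (Python 2 syntax):

import string
data = 'Hell,o World.'  # example input
# Python 2: None = no mapping table; the second argument lists characters to delete
print data.translate(None, string.punctuation)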
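
The snippet above only runs on Python 2, where str.translate accepts a second "characters to delete" argument. On Python 3 the same idea can be sketched with str.maketrans; the function name strip_punctuation is only an illustration, not part of the original answer:

```python
import string

def strip_punctuation(text):
    # str.maketrans('', '', chars) builds a table that maps every
    # character in chars to None, i.e. deletes it during translate().
    table = str.maketrans('', '', string.punctuation)
    return text.translate(table)

print(strip_punctuation('Hell,o World.'))  # Hello World
```

Note that string.punctuation covers ASCII punctuation only, so characters such as curly quotes are left in place.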

Other answers

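Several characters in string.punctuation have special meaning in regular expressions, so the raw string is not a valid pattern (the adjacent quantifiers *+ are the likely source of the "multiple repeat" error). They should be escaped.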

>>> import re
>>> import string
>>> string.punctuation
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
>>> re.escape(string.punctuation)
'\\!\\"\\#\\$\\%\\&\\\'\\(\\)\\*\\+\\,\\-\\.\\/\\:\\;\\<\\=\\>\\?\\@\\[\\\\\\]\\^\\_\\`\\{\\|\\}\\~'

And if you want to match any one of them, wrap the escaped string in a character class ([...]):

>>> '[{}]'.format(re.escape(string.punctuation))
'[\\!\\"\\#\\$\\%\\&\\\'\\(\\)\\*\\+\\,\\-\\.\\/\\:\\;\\<\\=\\>\\?\\@\\[\\\\\\]\\^\\_\\`\\{\\|\\}\\~]'

Then pass that pattern to re.sub to delete every match:

>>> pattern = '[{}]'.format(re.escape(string.punctuation))
>>> re.sub(pattern, '', 'Hell,o World.')
'Hello World'
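
Putting the pieces together, here is a minimal Python 3 sketch of the regex approach. It also covers the re.findall call the question asked about. The names PUNCT_RE, remove_punctuation and find_punctuation are illustrative assumptions, and the exact output of re.escape differs between Python versions (newer versions escape fewer characters), although the resulting character class matches the same set:

```python
import re
import string

# Escape the punctuation and wrap it in a character class so the
# pattern matches any single punctuation character.
PUNCT_RE = re.compile('[{}]'.format(re.escape(string.punctuation)))

def remove_punctuation(text):
    # Replace every punctuation character with the empty string.
    return PUNCT_RE.sub('', text)

def find_punctuation(text):
    # Return the punctuation characters found, in order of appearance.
    return PUNCT_RE.findall(text)

# The unescaped string is not a valid pattern, which is what the
# question ran into; the exact error message depends on the version.
try:
    re.compile(string.punctuation)
except re.error as exc:
    print('unescaped pattern fails:', exc)

print(remove_punctuation('Hell,o World.'))  # Hello World
print(find_punctuation('Hell,o World.'))    # [',', '.']
```

re.findall only reports the matches; to actually remove them you need re.sub (or the translate approach from the other answer).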
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow