Question

I'm aware that many questions about stripping punctuation have already been answered. However, mine is a bit different from the ones I've looked at.

I need code that strips all punctuation EXCEPT hyphens and single apostrophes.

The code I've found so far is:

import re
def textStrip():
    text = input("Text? ")
    # Raw string so that \w reaches the regex engine unchanged
    return re.compile(r'\w+').findall(text)

This works fantastically for stripping all punctuation. Now I'm wondering whether there is a way to add exceptions to it. If someone has a better approach altogether, that would help tremendously. Thanks!

Sample:

"A tall-ish wall, with trim.I don't want to paint it;"

Would return:

["A", "tall-ish", "wall", "with", "trim", "I", "don't", "want", "to", "paint", "it"]

Solution

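Put -, ' and \w inside [...] (a character class, which matches any one of the listed characters). Placing the hyphen first makes it a literal hyphen rather than a range operator: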

>>> import re
>>> text = "A tall-ish wall, with trim.I don't want to paint it;"
>>> re.findall(r"[-'\w]+", text)
['A', 'tall-ish', 'wall', 'with', 'trim', 'I', "don't", 'want', 'to', 'paint', 'it']
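
One caveat worth noting: the character class above also matches hyphens and apostrophes that stand alone or sit at the edge of a word, so input such as "'quoted' -- text" would yield "'quoted'" and "--" as tokens. If that matters, a stricter pattern that only allows - or ' between word characters may help. The following is a minimal sketch under that assumption; the function name tokenize is hypothetical and not part of the original answer.

```python
import re

# Hypothetical helper: keeps a hyphen or apostrophe only when it is
# surrounded by word characters on both sides.
WORD_RE = re.compile(r"\w+(?:[-']\w+)*")

def tokenize(text):
    """Return the words in text, dropping punctuation except inner - and '."""
    return WORD_RE.findall(text)

sample = "A tall-ish wall, with trim.I don't want to paint it;"
print(tokenize(sample))
print(tokenize("'quoted' -- well-known"))
```

On the sample sentence from the question this should give the same list as the simpler pattern; the two only differ when a hyphen or apostrophe is not enclosed by word characters.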
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow