Question

For example, if I have a string of numbers and a list of words:

My_number = ("5,6!7,8")
My_word =["hel?llo","intro"]

Solution

Using str.translate (Python 2; the two-argument form shown here does not exist in Python 3):

>>> from string import punctuation
>>> lis = ["hel?llo","intro"]
>>> [ x.translate(None, punctuation) for x in lis]
['helllo', 'intro']
>>> strs = "5,6!7,8"
>>> strs.translate(None, punctuation)
'5678'
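
For readers on Python 3, a minimal sketch of the same idea: str.translate there takes a single mapping, which str.maketrans can build. The helper name strip_punctuation is hypothetical and not part of the original answer.

```python
from string import punctuation

# Build a translation table that deletes every ASCII punctuation character.
# The third argument of str.maketrans lists the characters to remove.
_DELETE_PUNCT = str.maketrans("", "", punctuation)


def strip_punctuation(text):
    """Return text with ASCII punctuation removed (hypothetical helper)."""
    return text.translate(_DELETE_PUNCT)


print(strip_punctuation("5,6!7,8"))                          # 5678
print([strip_punctuation(w) for w in ["hel?llo", "intro"]])  # ['helllo', 'intro']
```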

Using regex:

>>> import re
>>> [ re.sub(r'[{}]+'.format(punctuation),'',x ) for x in lis]
['helllo', 'intro']
>>> re.sub(r'[{}]+'.format(punctuation),'', strs)
'5678'
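
One caveat with the pattern above: punctuation is dropped into the character class unescaped, so the `\]` pair inside it is read as an escaped bracket and a literal backslash is, as far as I can tell, not matched. A hedged sketch that escapes the set first with re.escape; the names PUNCT_RE and strip_punctuation_re are made up for illustration.

```python
import re
from string import punctuation

# re.escape stops characters such as ], ^, - and \ from being
# interpreted as character-class syntax.
PUNCT_RE = re.compile("[{}]+".format(re.escape(punctuation)))


def strip_punctuation_re(text):
    """Remove runs of ASCII punctuation with a precompiled regex (hypothetical helper)."""
    return PUNCT_RE.sub("", text)


print(strip_punctuation_re("5,6!7,8"))     # 5678
print(strip_punctuation_re(r"a\b]c^d-e"))  # abcde
```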

Using a list comprehension and str.join:

>>> ["".join([c for c in x if c not in punctuation])  for x in lis]
['helllo', 'intro']
>>> "".join([c for c in strs if c not in punctuation])
'5678'

Function:

>>> from collections import Iterable
>>> def my_strip(args):
...     if isinstance(args, Iterable) and not isinstance(args, basestring):
...         return [x.translate(None, punctuation) for x in args]
...     else:
...         return args.translate(None, punctuation)
...
>>> my_strip("5,6!7,8")
'5678'
>>> my_strip(["hel?llo","intro"])
['helllo', 'intro']
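
The function above relies on Python 2 names (basestring, collections.Iterable, two-argument translate). A rough Python 3 sketch of the same function, assuming str replaces basestring and Iterable is imported from collections.abc; the TypeError branch is an addition not present in the original.

```python
from collections.abc import Iterable
from string import punctuation

# Translation table that deletes ASCII punctuation.
_TABLE = str.maketrans("", "", punctuation)


def my_strip(args):
    """Strip punctuation from a string, or from each string in an iterable."""
    # A str is itself iterable, so it has to be checked first.
    if isinstance(args, str):
        return args.translate(_TABLE)
    if isinstance(args, Iterable):
        return [x.translate(_TABLE) for x in args]
    # Assumption: anything else is a caller error.
    raise TypeError("expected a string or an iterable of strings")


print(my_strip("5,6!7,8"))             # 5678
print(my_strip(["hel?llo", "intro"]))  # ['helllo', 'intro']
```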

OTHER TIPS

Assuming you meant for my_number to be a string,

>>> from string import punctuation
>>> my_number = "5,6!7,8"
>>> my_word = ["hel?llo", "intro"]
>>> remove_punctuation = lambda s: s.translate(None, punctuation)
>>> my_number = remove_punctuation(my_number)
>>> my_word = map(remove_punctuation, my_word)
>>> my_number
'5678'
>>> my_word
['helllo', 'intro']

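Here's a Unicode-aware solution. Po is the Unicode category "Punctuation, other", which covers characters such as ?, ! and the comma; other kinds of punctuation (dashes, brackets, quotes) fall under other categories whose names also start with P.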

>>> import unicodedata
>>> mystr = "1?2,3!abc"
>>> mystr = "".join([x for x in mystr if unicodedata.category(x) != "Po"])
>>> mystr
'123abc'
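
Since Po is only one of several punctuation categories, a variant that drops every category starting with "P" may be closer to what is wanted. This is a sketch under that assumption; strip_unicode_punctuation is a hypothetical name.

```python
import unicodedata


def strip_unicode_punctuation(text):
    """Drop every character whose Unicode category starts with 'P'.

    That covers Pc, Pd, Ps, Pe, Pi, Pf and Po, so dashes, brackets and
    quotation marks are removed as well as ?, ! and commas.
    """
    return "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )


print(strip_unicode_punctuation("1?2,3!abc"))             # 123abc
print(strip_unicode_punctuation("«hello» (world)-test"))  # hello worldtest
```

Note that characters such as $, + and ~ are classified as symbols (categories starting with "S") in Unicode, so unlike string.punctuation this approach leaves them in place.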

You can also do this with a regex, using re.sub. Unfortunately, the standard library's re module doesn't support Unicode categories, so you would have to list every character you want to remove by hand. The separate third-party regex library does support them, but it is not part of the standard library.

Using filter + str.isalnum:

>>> filter(str.isalnum, '5,6!7,8')
'5678'
>>> filter(str.isalnum, 'hel?llo')
'helllo'
>>> [filter(str.isalnum, word) for word in ["hel?llo","intro"]]
['helllo', 'intro']

This works only in Python 2. In Python 3, filter always returns an iterator, so you have to write ''.join(filter(str.isalnum, the_text)).
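
A short Python 3 sketch of the join-based form, wrapped in a hypothetical helper named keep_alnum. One thing to be aware of: str.isalnum rejects everything that is not a letter or digit, so whitespace is removed along with the punctuation.

```python
def keep_alnum(text):
    """Keep only alphanumeric characters (hypothetical helper, Python 3)."""
    # filter() yields characters lazily; join() reassembles them into a str.
    return "".join(filter(str.isalnum, text))


print(keep_alnum("5,6!7,8"))                               # 5678
print([keep_alnum(word) for word in ["hel?llo", "intro"]])  # ['helllo', 'intro']
print(keep_alnum("hello, world"))                          # helloworld
```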

Licensed under: CC-BY-SA with attribution