Question

I would like to remove all punctuation from a filename but keep its file extension intact.

e.g. I want:

Flowers.Rose-Murree-[25.10.11].jpg
Time.Square.New-York-[20.7.09].png

to look like:

Flowers Rose Murree 25 10 11.jpg
Time Square New York 20 7 09.png

I'm trying this in Python:

re.sub(r'[^A-Za-z0-9]', ' ', filename)

But that produces:

Flowers Rose Murree 25 10 11 jpg
Time Square New York 20 7 09 png

How do I remove the punctuation but keep the file extension?

Solution

There's only one right way to do this:

  1. os.path.splitext to get the filename and the extension
  2. Do whatever processing you want to the filename.
  3. Concatenate the new filename with the extension.
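The three steps above can be sketched as follows. This is only one possible reading: the function name clean_filename is made up for illustration, and collapsing runs of punctuation into a single space (then stripping the ends) is an assumption based on the output the question asks for.

```python
import os
import re

def clean_filename(filename):
    # Step 1: split off the extension (".jpg", ".png", ...).
    stem, ext = os.path.splitext(filename)
    # Step 2: replace each run of non-alphanumeric characters with one space,
    # then drop any leading or trailing space (assumed from the desired output).
    stem = re.sub(r'[^A-Za-z0-9]+', ' ', stem).strip()
    # Step 3: put the untouched extension back.
    return stem + ext

print(clean_filename("Flowers.Rose-Murree-[25.10.11].jpg"))
print(clean_filename("Time.Square.New-York-[20.7.09].png"))
```

Because the extension never passes through the substitution, the pattern from the question can be reused almost unchanged.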

OTHER TIPS

I suggest replacing each occurrence of [\W_](?=.*\.) with a space.
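A rough sketch of that suggestion, with a made-up helper name (replace_before_last_dot). The lookahead (?=.*\.) only lets a character match if a dot still follows it somewhere, so the last dot is never replaced. Two things to be aware of: every punctuation character becomes its own space, and a name that contains no dot at all is left unchanged.

```python
import re

def replace_before_last_dot(filename):
    # [\W_] matches anything that is not a letter or digit.
    # (?=.*\.) requires at least one dot later in the string,
    # which is true for everything except the final dot and what follows it.
    return re.sub(r'[\W_](?=.*\.)', ' ', filename)

print(repr(replace_before_last_dot("Flowers.Rose-Murree-[25.10.11].jpg")))
# 'Flowers Rose Murree  25 10 11 .jpg'
print(repr(replace_before_last_dot("no-extension")))
# 'no-extension'
```

The doubled space and the space before the extension come from "-[" and "]" each being replaced individually.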

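See if this works for you. You can actually do it without a regex. Note that this deletes the punctuation rather than replacing it with spaces:

>>> import os, string
>>> fname = "Flowers.Rose-Murree-[25.10.11].jpg"
>>> name, ext = os.path.splitext(fname)
>>> # Python 3: build a table that deletes every punctuation character
>>> name = name.translate(str.maketrans('', '', string.punctuation))
>>> name += ext
>>> name
'FlowersRoseMurree251011.jpg'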

You could use a negative lookahead, that asserts that you are not dealing with a dot that is only followed by digits and letters:

re.sub(r'(?!\.[A-Za-z0-9]*$)[^A-Za-z0-9]', ' ', filename)
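To see what that pattern does in practice, here is a small sketch (the helper name replace_except_extension_dot is hypothetical). The negative lookahead skips a dot only when nothing but letters and digits follows it up to the end of the string, so a name without an extension still gets its punctuation replaced.

```python
import re

def replace_except_extension_dot(filename):
    # (?!\.[A-Za-z0-9]*$) refuses to match at a dot that is followed
    # only by letters/digits until the end, i.e. the extension dot.
    # Every other non-alphanumeric character becomes a space.
    return re.sub(r'(?!\.[A-Za-z0-9]*$)[^A-Za-z0-9]', ' ', filename)

print(repr(replace_except_extension_dot("Time.Square.New-York-[20.7.09].png")))
# 'Time Square New York  20 7 09 .png'
print(repr(replace_except_extension_dot("no-extension")))
# 'no extension'
```

As with any one-for-one replacement, "-[" turns into two spaces and "]" leaves a space before the extension, so a follow-up step to collapse and trim spaces may be wanted.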

@katrielalex beat me to the type of answer, but anyway, a regex-free solution:

In [22]: import os

In [23]: f = "/etc/path/fred.apple.png"

In [24]: path, filename = os.path.split(f)

In [25]: main, suffix = os.path.splitext(filename)

In [26]: newname = os.path.join(path,''.join(c if c.isalnum() else ' ' for c in main) + suffix)

In [27]: newname
Out[27]: '/etc/path/fred apple.png'
Licensed under: CC-BY-SA with attribution