Is this an efficient way of listing all .mp3 files in a directory (including any subdirectories) in Python?

StackOverflow https://stackoverflow.com/questions/22255344

  •  11-06-2023

Question

Is this a good approach? Is there a more efficient way to do it (without trading code readability for efficiency)?

for root, dirs, files in os.walk(path, topdown=False):
    for name in files:
        if re.match(r'.*\.mp3', name):
            yield os.path.join(root, name) # yields the path of the .mp3 file

EDIT: Conclusion:

If you do not need recursion, the fastest way to do it is the glob module. If you do need recursion, switching from re.match() to string slices makes it a few milliseconds faster.
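
The slice-based check mentioned in the conclusion could look roughly like the sketch below. The function name iter_mp3 is hypothetical (it does not appear in the original post), and the comparison is case-sensitive, so a file named SONG.MP3 would be skipped.

```python
import os

def iter_mp3(path):
    """Yield the path of every file under `path` whose name ends in '.mp3'.

    Hypothetical helper for illustration only.
    """
    for root, dirs, files in os.walk(path):
        for name in files:
            # Compare the last four characters instead of calling re.match();
            # name.endswith('.mp3') is an equivalent, more readable check.
            if name[-4:] == '.mp3':
                yield os.path.join(root, name)
```

Because this is a generator, nothing is scanned until you iterate over it, for example with for mp3 in iter_mp3('.').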


Solution

A recursive directory walker in Python should be built on os.walk; that is the right choice. However, I would check the extension with os.path.splitext() instead of a regex. Also, return is not what you want here: it ends the function at the first .mp3 file it finds. Use yield instead, which turns the function into a generator. Call it from outside, and you can iterate over all .mp3 files in your directory tree.

A working solution, test.py:

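import os

def mp3gen():
    # Walk the current directory tree and yield the path of every .mp3 file.
    for root, dirs, files in os.walk('.'):
        for filename in files:
            # splitext() returns (base, extension); this comparison is case-sensitive.
            if os.path.splitext(filename)[1] == ".mp3":
                yield os.path.join(root, filename)

for mp3file in mp3gen():
    print(mp3file)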

Test:

$ mkdir testenv
$ cd testenv
$ mkdir subdir
$ touch test.mp3
$ touch subdir/test2.mp3
$ touch foo.mp4
$ python test.py
./test.mp3
./subdir/test2.mp3

By the way, whatever you do, the performance of this iteration is unlikely to be the bottleneck in your workflow. If it is, I would use the find utility (find . -name "*.mp3"), pipe its output to your Python script, and read the items from stdin with for line in sys.stdin.
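
The receiving side of that pipe might look like the sketch below. The names paths_from_stream and main, and the script name read_paths.py used in the note after the code, are assumptions for illustration. The sketch also assumes one path per line, so file names containing a newline would be split.

```python
import sys

def paths_from_stream(stream):
    """Yield one path per non-empty line of `stream` (hypothetical helper)."""
    for line in stream:
        # Drop only the line terminator; other whitespace may be part of the name.
        path = line.rstrip('\n')
        if path:
            yield path

def main():
    # Print every path received on standard input.
    for mp3file in paths_from_stream(sys.stdin):
        print(mp3file)
```

To use it, call main() at the end of the script and run something like find . -name "*.mp3" | python read_paths.py.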

Other suggestions

Note: This method requires Python 3.5 or later.

You can use the glob module for this:

import glob

# iglob() returns an iterator; '**' only matches subdirectories when recursive=True
mp3_files = glob.iglob('**/*.mp3', recursive=True)

for mp3 in mp3_files:
    print(mp3)

You can use glob.glob('**/*.mp3', recursive=True) if you want a list instead of a generator.
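
The pattern above is relative to the current working directory. To search under a given directory instead, one option is to build the pattern from that directory, as in this sketch. The function name iter_mp3_glob is hypothetical. Note that glob patterns skip names starting with a dot by default, so hidden files and directories are not matched.

```python
import glob
import os

def iter_mp3_glob(path):
    """Return an iterator over .mp3 files under `path` (hypothetical helper)."""
    # glob.escape() protects special characters such as '[' in the directory name.
    pattern = os.path.join(glob.escape(path), '**', '*.mp3')
    # '**' matches zero or more directory levels only when recursive=True.
    return glob.iglob(pattern, recursive=True)
```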

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow