Question

I use os.listdir and it works fine, but I get sub-directories in the list also, which is not what I want: I need only files.

What function do I need to use for that?

I also looked at os.walk and it seems to be what I want, but I'm not sure how it works.

Solution

You need to filter out directories; os.listdir() lists all names in a given path. You can use os.path.isdir() for this:

import os

basepath = '/path/to/directory'
for fname in os.listdir(basepath):
    path = os.path.join(basepath, fname)
    if os.path.isdir(path):
        # skip directories
        continue
    # path is not a directory; process it here

Note that os.path.isdir() follows symlinks, so a symlink that points to a directory is filtered out as well. The names that remain are not necessarily regular files; an entry could also be a symlink to a file. If you need to exclude symlinks too, test os.path.islink() first and skip the entry when it returns true.
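The symlink check described above could look like the following. This is only a sketch: the function name list_non_symlink_files is made up for illustration, and entries such as FIFOs or sockets would still be returned, since only directories and symlinks are excluded.

```python
import os

def list_non_symlink_files(basepath):
    """Return sorted names in basepath that are neither directories nor symlinks."""
    names = []
    for fname in os.listdir(basepath):
        path = os.path.join(basepath, fname)
        # Test islink() first: isdir() follows symlinks, islink() does not.
        if os.path.islink(path) or os.path.isdir(path):
            continue
        names.append(fname)
    return sorted(names)
```

Calling list_non_symlink_files('/path/to/directory') returns bare names; join them with the directory again if full paths are needed.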

On Python 3.5 or newer, an even better option is the os.scandir() function, which produces os.DirEntry instances. In the common case this is faster, because each DirEntry has usually already cached enough information to determine whether the entry is a directory:

import os

basepath = '/path/to/directory'
for entry in os.scandir(basepath):
    if entry.is_dir():
        # skip directories
        continue
    # use entry.path to get the full path of this entry, or use
    # entry.name for the base filename

You can use entry.is_file(follow_symlinks=False) if only regular files (and not symlinks) are needed.
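As a rough sketch of that variant, the helper below collects only regular files and ignores symlinks. The name regular_files is hypothetical, and using os.scandir() as a context manager assumes Python 3.6 or newer.

```python
import os

def regular_files(basepath):
    """Return sorted names of regular files in basepath, ignoring symlinks."""
    # The with-block closes the directory iterator promptly (Python 3.6+).
    with os.scandir(basepath) as entries:
        return sorted(
            entry.name
            for entry in entries
            if entry.is_file(follow_symlinks=False)
        )
```

Swap entry.name for entry.path if full paths are wanted instead of bare filenames.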

os.walk() does the same work under the hood; unless you need to recurse down subdirectories, you don't need to use os.walk() here.
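If os.walk() is used anyway, only its first result is needed for a non-recursive listing. A minimal sketch, with top_level_files as a made-up name:

```python
import os

def top_level_files(basepath):
    """Return sorted file names directly inside basepath, without recursing."""
    # os.walk() yields (dirpath, dirnames, filenames) tuples; the first
    # tuple describes basepath itself, so return as soon as it arrives.
    for _dirpath, _dirnames, filenames in os.walk(basepath):
        return sorted(filenames)
    # os.walk() yields nothing when basepath cannot be listed.
    return []
```

The filenames list holds every entry that is not a directory, so symlinks to files are included.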

OTHER TIPS

Here is a nice little one-liner in the form of a list comprehension:

[f for f in os.listdir(your_directory) if os.path.isfile(os.path.join(your_directory, f))]

This will return a list of filenames within the specified your_directory.

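import os
directoryOfChoice = "C:\\"  # replace with a directory of your choice
# join each name with the directory; a bare name is resolved against the current directory
files = list(filter(lambda f: os.path.isfile(os.path.join(directoryOfChoice, f)),
                    os.listdir(directoryOfChoice)))

P.S.: os.getcwd() returns the current working directory.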

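import os

# bare names work here only because '.' is the current directory;
# for any other path, join the directory and fname first
for fname in os.listdir('.'):
    if os.path.isdir(fname):
        pass  # do your stuff here for a directory
    else:
        pass  # do your stuff here for anything else (usually a file)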

The solution with os.walk() would be:

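import os
for r, d, f in os.walk('path/to/dir'):
    for fname in f:
        # os.walk() recurses: this prints files in all subdirectories too
        print(os.path.join(r, fname))
    # add "break" here to stop after the top-level directory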

Even though this is an older post, for the sake of completeness let me add the pathlib library, introduced in Python 3.4, which provides an object-oriented way of handling directories and files. To get all files in a directory, you can use:

from pathlib import Path

def get_list_of_files_in_dir(directory: str, file_types: str = '*') -> list:
    return [f for f in Path(directory).glob(file_types) if f.is_file()]

Following your example, you could use it like this:

mypath = '/path/to/directory'
files = get_list_of_files_in_dir(mypath)

If you only want a subset of files depending on the file extension (e.g. "only csv files"), you can use:

files = get_list_of_files_in_dir(mypath, '*.csv')

Note that, per PEP 471, the DirEntry method signature is is_dir(*, follow_symlinks=True), so symlinks are followed by default.

So:

from os import scandir
folder = '/home/myfolder/'
for entry in scandir(folder):
    if entry.is_dir():
        # do code or skip
        continue
    myfile = entry.path  # full path; safer than folder + entry.name
    # do something with myfile

    
Licensed under: CC-BY-SA with attribution