Question

I am trying to delete several files within a directory.

So far I have that code:

   for filename in glob.glob("buffer*" ):
        os.remove(filename) 
    for filename in glob.glob("grid*" ):
        os.remove(filename)
    for filename in glob.glob("OSMroads*" ):
        os.remove(filename)
    for filename in glob.glob("newCostSurface*" ):
        os.remove(filename)
    for filename in glob.glob("standsLine*" ):
        os.remove(filename)
    for filename in glob.glob("standsReprojected*" ):
        os.remove(filename)

Is there a way to do this more efficiently?

Solution 2

Doing six separate glob calls will of course iterate over the directory six times.

Fortunately, on almost any platform, it'll probably end up being cached after the first time. Unless your directory is absolutely gigantic, this won't be a noticeable problem.

But since you explicitly asked about efficiency, you can obviously iterate once and filter the results. The easiest way to do this is with fnmatch. All that glob is doing is calling listdir and then fnmatch on each result; you can do the same thing with multiple fnmatch calls:

import fnmatch
import os

for filename in os.listdir('.'):
    if fnmatch.fnmatch(filename, 'buffer*'):
        os.remove(filename)
    # etc.

And of course you can simplify this in exactly the same way partofthething simplified your existing code:

for filename in os.listdir('.'):
    for pattern in ['buffer*', 'grid*', 'OSMroads*',
                    'newCostSurface*', 'standsLine*', 'standsReprojected*']:
        if fnmatch.fnmatch(filename, pattern):
            os.remove(filename)
            break  # the file is gone, so skip the remaining patterns

Or:

for filename in os.listdir('.'):
    if any(fnmatch.fnmatch(filename, pattern)
           for pattern in ['buffer*', 'grid*', 'OSMroads*',
                           'newCostSurface*', 'standsLine*', 'standsReprojected*']):
        os.remove(filename)

If you really need to squeeze out another tiny fraction of a percent of performance, you can use fnmatch.translate to convert each pattern into a regexp, merge the regexps into a single alternation, compile it, and apply that compiled regexp to each filename. But the CPU time spent in fnmatch is so small compared to the I/O time spent reading the directory that the improvement probably wouldn't even be measurable.
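A minimal sketch of that regexp approach, reusing the same six patterns, might look like this:

import fnmatch
import os
import re

patterns = ['buffer*', 'grid*', 'OSMroads*',
            'newCostSurface*', 'standsLine*', 'standsReprojected*']
# fnmatch.translate turns each glob pattern into an end-anchored regexp;
# joining them with '|' gives a single alternation matching any of them.
combined = re.compile('|'.join(fnmatch.translate(p) for p in patterns))
for filename in os.listdir('.'):
    # match() anchors at the start, and each translated pattern is
    # anchored at the end, so this tests the whole filename.
    if combined.match(filename):
        os.remove(filename)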

OTHER TIPS

I like using lists so I don't repeat code, like this:

import glob
import os

for pattern in ['buffer*', 'grid*', 'OSMroads*', 'newCostSurface*',
                'standsLine*', 'standsReprojected*']:
    for filename in glob.glob(pattern):
        os.remove(filename)

Licensed under: CC-BY-SA with attribution