Question

Is there any built-in or straightforward way to match paths recursively with double asterisk, e.g. like zsh does?

For example, with

path = 'foo/bar/ham/spam/eggs.py'

I can use fnmatch to test it with

fnmatch(path, 'foo/bar/ham/*/*.py'

Although, I would like to be able to do:

fnmatch(path, 'foo/**/*.py')

I know that fnmatch maps its pattern to regex, so in the words case I can roll my own fnmatch with additional ** pattern, but maybe there is an easier way

Was it helpful?

Solution

If you look into fnmatch source code closely, it internally converts the pattern to a regular expression, mapping * into .* (and not [^/]* or similar) and thus does not care anything for directory separators / - unlike UNIX shells:

while i < n:
    c = pat[i]
    i = i+1
    if c == '*':
        res = res + '.*'
    elif c == '?':
        res = res + '.'
    elif c == '[':
        ...

Thus

>>> fnmatch.fnmatch('a/b/d/c', 'a/*/c')
True
>>> fnmatch.fnmatch('a/b/d/c', 'a/*************c')
True

OTHER TIPS

For an fnmatch variant that works on paths, you can use a library called wcmatch which implements a globmatch function that matches a path with the same logic that glob crawls a filesystem with. You can control the enabled features with flags, in this case, we enable GLOBSTAR (using ** for recursive directory search).

>>> from wcmatch import glob
>>> glob.globmatch('some/file/path/filename.txt', 'some/**/*.txt', flags=glob.GLOBSTAR)
True

If you can live without using an os.walk loop, try:

glob2

formic

I personally use glob2:

import glob2
files = glob2.glob(r'C:\Users\**\iTunes\**\*.mp4')

Addendum:

As of Python 3.5, the native glob module supports recursive pattern matching:

import glob
files = glob.iglob(r'C:\Users\**\iTunes\**\*.mp4', recursive=True) 

This snippet adds compatibility for **

import re
from functools import lru_cache
from fnmatch import translate as fnmatch_translate


@lru_cache(maxsize=256, typed=True)
def _compile_fnmatch(pat):
    # fixes fnmatch for recursive ** (for compatibilty with Path.glob)
    pat = fnmatch_translate(pat)
    pat = pat.replace('(?s:.*.*/', '(?s:(^|.*/)')
    pat = pat.replace('/.*.*/', '.*/')
    return re.compile(pat).match


def fnmatch(name, pat):
    return _compile_fnmatch(str(pat))(str(name)) is not None
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top