Question

I am trying to find arbitrary files whose names may contain, among other scary characters, [square brackets]. Since I'm using -iwholename, I assumed find treats its argument literally, but the following works only when the brackets are escaped.

 $ touch [aoeu]
 $ ls
    [aoeu]
 $ find ./ -type f -iwholename ./[AOEU]
 $ find ./ -type f -iwholename ./\\[AOEU\\]
    ./[aoeu]

I found this answer, but it's talking about regexes, which I am not trying to use. I was just experimenting a bit more; I also realize find is doing other things I don't expect:

 $ touch \*aoeu\*
 $ ls
    [aoeu]  *aoeu*
 $ find ./ -type f -iwholename ./\*AOEU\*   # I don't expect this expansion.
    ./*aoeu*
    ./[aoeu]
 $ find ./ -type f -iwholename ./\\\*AOEU\\\*
    ./*aoeu*

How can I avoid, for arbitrary strings, this non-literal interpretation of some characters? Why does it happen?

EDIT:

Always good to read the 'effing manpage again. This time I found what I missed before:

  -iwholename pattern
          Like -wholename, but the match is case insensitive.

   -wholename pattern
          See -path.    This alternative is less portable than -path.

   -path pattern
          File name matches shell pattern pattern.  The metacharacters do not treat `/' or `.' specially; so, for example,
                    find . -path "./sr*sc"
          will print an entry for a directory called `./src/misc' (if one exists).  To ignore a whole directory tree, use -prune rather than checking every file in the tree.  For example, to skip the directory `src/emacs' and all files and directories under it, and print the names of the other files found, do something like this:
                    find . -path ./src/emacs -prune -o -print
          Note that the pattern match test applies to the whole file name, starting from one of the start points named on the command line.  It would only make sense to use an absolute path name here if the relevant start point is also an absolute path.  This means that this command will never match anything:
                    find bar -path /foo/bar/myfile -print
          The predicate -path is also supported by HP-UX find and will be in a forthcoming version of the POSIX standard.

My "why" question is answered; now, as I said before, "How can I avoid, for arbitrary strings, this non-literal interpretation of some characters?"

Solution

The BSD man page for find says:

-path pattern
         True if the pathname being examined matches pattern.  Special
         shell pattern matching characters (``['', ``]'', ``*'', and
         ``?'') may be used as part of pattern.  These characters may be
         matched explicitly by escaping them with a backslash (``\'').
         Slashes (``/'') are treated as normal characters and do not have
         to be matched explicitly.

(-wholename is a synonym for -path, and -iwholename is the case-insensitive version of -path.)

You have to escape these characters because find itself interprets the pattern as a shell-style glob, even after the shell has finished its own quoting and expansion. The same applies to other pattern tests such as -name and -iname.
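The two layers can be seen separately in a scratch directory. This is only a small sketch: the directory comes from mktemp, and the extra file named a is a made-up example added to show what the unescaped bracket pattern really matches.

```shell
#!/usr/bin/env bash
# Scratch directory with one bracketed name and one single-letter name
# (both are hypothetical examples).
demo_dir=$(mktemp -d)
touch "$demo_dir/[aoeu]" "$demo_dir/a"

# Single quotes keep the shell from touching the brackets, but find still
# reads them as a bracket expression matching ONE character from the set.
quoted_only=$(cd "$demo_dir" && find . -type f -iwholename './[AOEU]')

# A backslash that survives the shell quoting tells find to treat each
# bracket as a literal character.
escaped=$(cd "$demo_dir" && find . -type f -iwholename './\[AOEU\]')

printf 'quoted only: %s\n' "$quoted_only"
printf 'escaped:     %s\n' "$escaped"

rm -rf "$demo_dir"
```

Quoting alone should report the single-letter file, while the escaped form should report the bracketed one.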

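To make find work with arbitrary strings, backslash-escape every pattern metacharacter, including the backslash itself, before passing the string to find. Because -iwholename matches against the whole path, the pattern also needs the same ./ prefix that find prints. For example:

escaped=$(sed -e 's/[][?*\\]/\\&/g' <<< "*aoeu*")
find . -iwholename "./$escaped"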
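One way to gain some confidence in the sed expression is a round-trip check: create files with awkward names, escape each name, and confirm that find reports exactly one hit per name. The sketch below assumes bash (for the array) and uses made-up file names; it also escapes the backslash, since find treats it as the escape character.

```shell
#!/usr/bin/env bash
# Round-trip check in a scratch directory; the names are made-up examples.
scratch=$(mktemp -d)
names=('[aoeu]' '*aoeu*' 'a?b' 'back\slash' 'aoeu')
for n in "${names[@]}"; do : > "$scratch/$n"; done

failures=0
for n in "${names[@]}"; do
    # Escape the glob metacharacters, including the backslash itself.
    esc=$(printf '%s\n' "$n" | sed -e 's/[][?*\\]/\\&/g')
    # Count the matching paths; a literal match should find exactly one.
    hits=$(cd "$scratch" && find . -type f -iwholename "./$esc" | wc -l)
    [ "$hits" -eq 1 ] || failures=$((failures + 1))
done

rm -rf "$scratch"
echo "failures: $failures"
```

Names that contain a newline are not covered by this check, because sed works line by line and command substitution drops trailing newlines.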

UPDATE

As you yourself figured out, if you need to escape many file names per second, it is more efficient to do the replacement in bash instead of spawning sed every time, like this:

# Escape the backslash first so the backslashes added below are not doubled.
filename_escaped="${filename//\\/\\\\}"
filename_escaped="${filename_escaped//\[/\\[}"
filename_escaped="${filename_escaped//\]/\\]}"
filename_escaped="${filename_escaped//\*/\\*}"
filename_escaped="${filename_escaped//\?/\\?}"
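
If the escaping is needed in several places, the substitutions can be wrapped in a function. The following is a minimal sketch, not a definitive implementation: escape_glob is a hypothetical helper name (it is not part of bash or find), and the sample name is only an illustration.

```shell
#!/usr/bin/env bash
# escape_glob: hypothetical helper that prints its first argument with
# every find-pattern metacharacter preceded by a backslash.
escape_glob() {
    local s=$1
    s="${s//\\/\\\\}"   # backslash first, so later escapes are not doubled
    s="${s//\[/\\[}"
    s="${s//\]/\\]}"
    s="${s//\*/\\*}"
    s="${s//\?/\\?}"
    printf '%s\n' "$s"
}

name='*aoeu*'                        # example input
pattern="./$(escape_glob "$name")"   # same ./ prefix that find prints
printf '%s\n' "$pattern"
```

The resulting pattern can then be passed on as find . -type f -iwholename "$pattern". Because command substitution strips trailing newlines, a name that ends in a newline would need different handling.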
Licensed under: CC-BY-SA with attribution