Question

I want to consolidate into 1 directory files that are in multiple subdirectories.

The following comes close except that the random string is added after the extension; I want it before the extension:

find . -type f -iname "[a-z,0-9]*" -exec bash -c 'mv -v "$0" "./$( mktemp "$( basename "$0" ).XXX" )"' '{}' \;

I've searched through dozens of other posts but nothing addressed the specifics of my situation:

  • I'm on OS X (so it's a BSD flavor of Bash; for ex. there's no -t option for mv)
  • Many of the files have identical names, so I need to rename them during the mv (and I can't just use the -n option for mv because too many files would then not get moved)
  • The files are not all the same kind, so I need to use a find -type f
  • I want to exclude .DS_Store files, so it seems like a good option is find -type f -iname "[a-z,0-9]*"
  • I want the renamed files' names to be in the form of: oldname-random_string.xyz (but I'm also OK with having the files renamed as a sequential list: 00001.xyz, 00002.xyz, etc.)
  • The files are buried 4 levels down from my master directory:
    • Master/Top dir
    • Dir 2
    • Dir 3
    • Dir 4
    • Dir 5
    • file
  • For the sake of simplicity I prefer a bash command to a .sh script (but I'm happy with either)

Solution

GNU Solution

This uses basically the same command that you were using but I supply a template to mktemp so that the XXX pattern appears just before the suffix. With GNU sed:

find . -type f -iname "[a-z,0-9]*" -exec bash -c 'mv -v "$1" "./$(mktemp -u "$(basename "$1" | sed -E -e '\''s/\.([^.]+)$/.XXX.\1/'\'' -e '\''/XXX/ !s/$/.XXX/'\'')" )"' _ '{}' \;

The key addition above is the use of sed to insert XXX before the suffix in the file name:

sed -E -e 's/\.([^.]+)$/.XXX.\1/' -e '/XXX/ !s/$/.XXX/'

This has two commands. The first puts .XXX before the extension. The second command is run only if the file name has no extension in which case it adds .XXX to the end of the file name.

In the first command, the source regex consists of two parts. The first is \. which matches a period. The second is ([^.]+)$ which captures the extension into group 1. The substitution replaces this with .XXX.\1 where \1 is sed notation for group 1 which, in our case, is the file's extension.
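
To see what the two sed commands do in isolation, a small sketch like the following can be run on its own. The function name insert_xxx and the sample file names are illustrative assumptions, not part of the original command:

```shell
#!/bin/bash
# insert_xxx is a hypothetical helper: it wraps the two sed commands from the
# answer so their effect on a single file name can be inspected.
insert_xxx() {
    printf '%s\n' "$1" |
        sed -E -e 's/\.([^.]+)$/.XXX.\1/' -e '/XXX/ !s/$/.XXX/'
}

insert_xxx 'photo.jpg'        # prints photo.XXX.jpg
insert_xxx 'archive.tar.gz'   # prints archive.tar.XXX.gz (only the last suffix counts)
insert_xxx 'README'           # prints README.XXX (no extension, so .XXX is appended)
```

Note that the second command tests for the literal text XXX, so an extensionless file whose name already contains XXX would be passed through unchanged.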

OSX Solution

Under OSX, mktemp is not useful because it only supports templates with the XXX part trailing. As a workaround, we can use a bash script that generates non-overlapping file names:

#!/bin/bash
find . -type f -iname "[a-z,0-9]*" -print0 |
while IFS= read -r -d '' fname
do
    new=$(basename "$fname")
    [ "$fname" = "./$new" ] && continue
    [ "$new" = .DS_Store ] && continue
    name=${new%.*}
    ext=${new#"$name"}
    n=0
    new=$(printf '%s.%03i%s' "$name" "$n" "$ext")
    while [ -f "$new" ]
    do
        n=$(($n + 1))
        new=$(printf '%s.%03i%s' "$name" "$n" "$ext")
    done
    mv -v "$fname" "$new"
done

The above uses the find command to get the file names. The option -print0 is used to ensure that it works with difficult file names (for example, names containing spaces or newlines). The while loop reads these file names one by one into the variable fname, which includes the full path to the source file. The file name without the path is then stored in new.

Then two checks are performed. If the source file is already in the current directory, the script skips to the next iteration of the loop. Similarly, if the file name is .DS_Store, it is also skipped. (The find command, as given, already skips these files. This line is there just for future flexibility.)

Next, the file name is split into two parts: name and ext, the extension. ext includes the leading period. Finally, a loop checks for files of the form name.NNN.ext and stops at the first one that doesn't yet exist. The source file is moved to a file of that name.
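
The splitting and numbering steps can be tried without moving any real files. The following is a minimal sketch under stated assumptions: next_free_name is a hypothetical helper name, it takes the target directory as an argument instead of using the current directory, and it tests with -e rather than -f so that directories also count as taken:

```shell
#!/bin/bash
# next_free_name is a hypothetical helper that mirrors the script's naming loop:
# it prints the first name of the form name.NNN.ext that does not yet exist
# in directory $1 for the base file name $2.
next_free_name() {
    local dir=$1 base=$2
    local name=${base%.*}        # everything before the last period
    local ext=${base#"$name"}    # the remainder, including the leading period
    local n=0
    local candidate
    candidate=$(printf '%s.%03d%s' "$name" "$n" "$ext")
    while [ -e "$dir/$candidate" ]
    do
        n=$((n + 1))
        candidate=$(printf '%s.%03d%s' "$name" "$n" "$ext")
    done
    printf '%s\n' "$candidate"
}

# Try it in a throwaway directory so no real files are touched.
demo=$(mktemp -d)
next_free_name "$demo" 'report.pdf'     # prints report.000.pdf
touch "$demo/report.000.pdf"
next_free_name "$demo" 'report.pdf'     # prints report.001.pdf
next_free_name "$demo" 'Makefile'       # prints Makefile.000 (ext is empty)
rm -r "$demo"
```

Because the first candidate is always name.000.ext, every moved file gets a numeric part, even when no other file shares its name.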

Related Notes Regarding the GNU Solution and its Compatibility

  • Quoting in the above GNU command is complex. The argument to bash -c needs to be in single quotes to prevent the calling shell from performing premature variable substitution. In addition, the sed commands need to be in single quotes when executed by the inner bash so that characters such as $, \ and ! reach sed literally instead of being interpreted by the shell. Since a single-quoted string cannot contain a single quote, each inner quote is written as '\''.

  • The OSX (BSD) sed is stricter than GNU sed about combining commands with semicolons. To stay portable, each command is supplied to sed via a separate -e option.

  • The OSX (BSD) sed seems to treat + differently from the GNU sed. This incompatibility seems to go away when using the -E (extended regex) option. (The traditional GNU option is -r, but GNU sed also accepts -E for compatibility.)
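
The quoting described in these notes can be checked on a harmless command before running the full find. This is a sketch only; the sample name notes.txt is an assumption made for illustration:

```shell
#!/bin/bash
# Inside a single-quoted string, '\'' closes the quote, adds a literal single
# quote, and reopens the quote. The inner bash therefore sees the sed script
# wrapped in its own single quotes.
bash -c 'printf "%s\n" "$1" | sed -E -e '\''s/\.([^.]+)$/.XXX.\1/'\''' _ 'notes.txt'
# prints notes.XXX.txt

# The "_" fills $0 of the inner shell, so the file name arrives as $1.
bash -c 'printf "%s / %s\n" "$0" "$1"' _ 'notes.txt'
# prints _ / notes.txt
```

The command in the question passed '{}' directly after the script, which is why it read the file name from $0; adding the _ placeholder moves it to the more conventional $1.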

Licensed under: CC-BY-SA with attribution