Hashing multiple files

https://stackoverflow.com/questions/1841737

12-09-2019

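Question

Problem specification:

Given a directory, I want to iterate through the directory and its non-hidden subdirectories,
and add a whirlpool hash to the names of the non-hidden files.
If the script is re-run, it should replace the old hash with a new one.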

<filename>.<extension> ==> <filename>.<a-whirlpool-hash>.<extension>

<filename>.<old-hash>.<extension> ==> <filename>.<new-hash>.<extension>
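
The two renaming rules above can be sketched as a small pure function. This is only an illustration in Python, not code from any of the answers: the name insert_hash is hypothetical, and it assumes an old hash is any dot-separated component made of hex digits with the same length as the new digest (so a genuine extension that happens to look like that would be treated as a hash).

```python
import re

def insert_hash(name, digest):
    """Return `name` with `digest` placed before the last extension.

    A dot-separated component that is a hex string of the same length as
    `digest` is assumed to be an old hash and is dropped first.
    """
    old_hash = re.compile(r"[0-9a-fA-F]{%d}" % len(digest))
    parts = name.split(".")
    # Keep the first component unconditionally so a file whose whole name
    # looks like a hash is not reduced to an empty name.
    parts = parts[:1] + [p for p in parts[1:] if not old_hash.fullmatch(p)]
    if len(parts) == 1:  # no extension left: append the hash
        return "%s.%s" % (parts[0], digest)
    return ".".join(parts[:-1] + [digest, parts[-1]])
```

For example, with an MD5-sized digest, "j.ext1.ext2" gains the hash between ext1 and ext2, and "g.<old-hash>.ext" has its old hash swapped for the new one.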

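Questions:

a) How would you do this?

b) Out of all the methods available to you, what makes your method the most suitable?

Verdict:

Thanks, I have selected SeigeX's answer for its speed and portability.
It is empirically faster than the other bash variants,
and it worked without alteration on my Mac OS X machine.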

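Solution

Updated to fix:
1. File names with '[' or ']' in their name (really, any character)
2. Handling of md5sum when hashing a file with a backslash or newline in its name
3. Moved the hash-checking algorithm into a function for modularity
4. Refactored the hash-checking logic to remove double negatives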

#!/bin/bash
if (($# != 1)) || ! [[ -d "$1" ]]; then
    echo "Usage: $0 /path/to/directory"
    exit 1
fi

is_hash() {
 md5=${1##*.} # strip prefix
 [[ "$md5" == *[^[:xdigit:]]* || ${#md5} -lt 32 ]] && echo "$1" || echo "${1%.*}"
}

while IFS= read -r -d $'\0' file; do
    read hash junk < <(md5sum "$file")
    basename="${file##*/}"
    dirname="${file%/*}"
    pre_ext="${basename%.*}"
    ext="${basename:${#pre_ext}}"

    # File already hashed?
    pre_ext=$(is_hash "$pre_ext")
    ext=$(is_hash "$ext")

    mv "$file" "${dirname}/${pre_ext}.${hash}${ext}" 2> /dev/null

done < <(find "$1" -path "*/.*" -prune -o \( -type f -print0 \))

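This code has the following benefits over the other entries so far:

It is fully compliant with Bash versions 2.0.2 and beyond
No superfluous calls to other binaries such as sed or grep; it uses builtin parameter expansion instead
Uses process substitution for find rather than a pipe, so the while loop does not run in a subshell
Takes the directory to work on as an argument and sanity-checks it
Uses $() rather than backtick notation for command substitution; the backtick form is the legacy one
Works with files with spaces
Works with files with newlines
Works with files with multiple extensions
Works with files with no extension
Does not traverse hidden directories
Does NOT skip pre-hashed files; it recalculates the hash, as per the spec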

Test tree

$ tree -a a
a
|-- .hidden_dir
|   `-- foo
|-- b
|   `-- c.d
|       |-- f
|       |-- g.5236b1ab46088005ed3554940390c8a7.ext
|       |-- h.d41d8cd98f00b204e9800998ecf8427e
|       |-- i.ext1.5236b1ab46088005ed3554940390c8a7.ext2
|       `-- j.ext1.ext2
|-- c.ext^Mnewline
|   |-- f
|   `-- g.with[or].ext
`-- f^Jnewline.ext

4 directories, 9 files

Result:

$ tree -a a
a
|-- .hidden_dir
|   `-- foo
|-- b
|   `-- c.d
|       |-- f.d41d8cd98f00b204e9800998ecf8427e
|       |-- g.d41d8cd98f00b204e9800998ecf8427e.ext
|       |-- h.d41d8cd98f00b204e9800998ecf8427e
|       |-- i.ext1.d41d8cd98f00b204e9800998ecf8427e.ext2
|       `-- j.ext1.d41d8cd98f00b204e9800998ecf8427e.ext2
|-- c.ext^Mnewline
|   |-- f.d41d8cd98f00b204e9800998ecf8427e
|   `-- g.with[or].d41d8cd98f00b204e9800998ecf8427e.ext
`-- f^Jnewline.d3b07384d113edec49eaa6238ad5ff00.ext

4 directories, 9 files

Other answers

#!/bin/bash
find -type f -print0 | while read -d $'\0' file
do
    md5sum=`md5sum "${file}" | sed -r 's/ .*//'`
    filename=`echo "${file}" | sed -r 's/\.[^./]*$//'`
    extension="${file:${#filename}}"
    filename=`echo "${filename}" | sed -r 's/\.md5sum-[^.]+//'`
    if [[ "${file}" != "${filename}.md5sum-${md5sum}${extension}" ]]; then
        echo "Handling file: ${file}"
        mv "${file}" "${filename}.md5sum-${md5sum}${extension}"
    fi
done

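Tested on files containing spaces, like 'a b'
Tested on files containing multiple extensions, like 'a.b.c'
Tested with directories containing spaces and/or dots.
Tested on files containing no extension inside directories containing dots, such as 'a.b/c'
Updated: now updates the hash if the file changes.

Key points:

Use of -print0 piped to while read -d $'\0', to correctly handle spaces in file names.
md5sum can be replaced with your favourite hash function. The sed removes the first space and everything after it from the output of md5sum.
The base filename is extracted using a regular expression that finds the last period that is not followed by another slash (so that periods in directory names are not counted as part of the extension).
The extension is found by taking a substring whose starting index is the length of the base filename.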
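
To make the "last period not followed by a slash" rule concrete, here is a rough Python equivalent of the sed expression \.[^./]*$ used above. The function name split_extension is made up for this sketch; only the pattern comes from the answer.

```python
import re

# Same pattern as the sed expression above: a final period followed only by
# characters that are neither '.' nor '/'.
_LAST_EXT = re.compile(r"\.[^./]*$")

def split_extension(path):
    """Split `path` into (base, extension); the extension keeps its dot."""
    m = _LAST_EXT.search(path)
    if m is None:
        return path, ""
    return path[:m.start()], path[m.start():]
```

A path such as ./a.b/c comes back with an empty extension, because the only periods in it are followed by a slash somewhere before the end of the string.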

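The logic of the requirements is complex enough to justify using Python instead of bash. It should provide a more readable, extensible, and maintainable solution.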

#!/usr/bin/env python
import hashlib, os

def ishash(h, size):
    """Whether `h` looks like hash's hex digest."""
    if len(h) == size: 
        try:
            int(h, 16) # whether h is a hex number
            return True
        except ValueError:
            return False

for root, dirs, files in os.walk("."):
    dirs[:] = [d for d in dirs if not d.startswith(".")] # skip hidden dirs
    for path in (os.path.join(root, f) for f in files if not f.startswith(".")):
        # read in binary mode so the digest matches the bytes on disk
        suffix = hash_ = "." + hashlib.md5(open(path, "rb").read()).hexdigest()
        hashsize = len(hash_) - 1
        # extract old hash from the name; add/replace the hash if needed
        barepath, ext = os.path.splitext(path) # ext may be empty
        if not ishash(ext[1:], hashsize):
            suffix += ext # add original extension
            barepath, oldhash = os.path.splitext(barepath) 
            if not ishash(oldhash[1:], hashsize):
               suffix = oldhash + suffix # preserve 2nd (not a hash) extension
        else: # ext looks like a hash
            oldhash = ext
        if hash_ != oldhash: # replace old hash by new one
           os.rename(path, barepath+suffix)

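Here is a test directory tree. It contains:

files without an extension inside directories with a dot in their name
a filename that already has a hash in it (tests idempotency)
a filename with two extensions
newlines in names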

$ tree a
a
|-- b
|   `-- c.d
|       |-- f
|       |-- f.ext1.ext2
|       `-- g.d41d8cd98f00b204e9800998ecf8427e
|-- c.ext^Mnewline
|   `-- f
`-- f^Jnewline.ext1

7 directories, 5 files

Result:

$ tree a
a
|-- b
|   `-- c.d
|       |-- f.0bee89b07a248e27c83fc3d5951213c1
|       |-- f.ext1.614dd0e977becb4c6f7fa99e64549b12.ext2
|       `-- g.d41d8cd98f00b204e9800998ecf8427e
|-- c.ext^Mnewline
|   `-- f.0bee89b07a248e27c83fc3d5951213c1
`-- f^Jnewline.b6fe8bb902ca1b80aaa632b776d77f83.ext1

7 directories, 5 files

The solution works correctly for all cases.
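
One caveat with the script above is that it reads each file into memory in one go. A minimal sketch of hashing in chunks instead, assuming the algorithm name is one that hashlib.new accepts on your system (the helper name file_digest and the 1 MiB chunk size are arbitrary choices for this example):

```python
import hashlib

def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Hex digest of the file at `path`, read in binary mode in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps calling fh.read until it returns b""
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

The result is the same digest as hashing the whole content at once; only the peak memory use differs.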

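The whirlpool hash is not in Python's standard library, but there are both pure Python and C extensions that support it, e.g. python-mhash.

To install it:

$ sudo apt-get install python-mhash

To use it: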

import mhash

print mhash.MHASH(mhash.MHASH_WHIRLPOOL, "text to hash here").hexdigest()

Output: cbdca4520cc5c131fc3a86109dd23fee2d7ff7be56636d398180178378944a4f41480b938608ae98da7eccbf39a4c79b83a8590c4cb1bace5bc638fc92b3e653

Invoking `whirlpooldeep` from Python

from subprocess import PIPE, STDOUT, Popen

def getoutput(cmd):
    return Popen(cmd, stdout=PIPE, stderr=STDOUT).communicate()[0]

hash_ = getoutput(["whirlpooldeep", "-q", path]).rstrip()
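
Note that the helper above merges stderr into stdout, so an error message could end up being used as the hash. A hedged alternative sketch using subprocess.run with check=True; the helper name tool_digest is made up, and it assumes the external tool prints the digest as the first whitespace-separated field of its output (as md5sum does):

```python
import subprocess

def tool_digest(command, path):
    """Run `command + [path]` and return the first field of its stdout.

    Raises subprocess.CalledProcessError if the tool exits non-zero,
    instead of silently returning its error text.
    """
    result = subprocess.run(command + [path], stdout=subprocess.PIPE, check=True)
    return result.stdout.split()[0].decode("ascii")
```

It would be called as tool_digest(["whirlpooldeep", "-q"], path), given that whirlpooldeep is installed.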

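git can provide leverage for problems that need to keep track of sets of files based on their hashes.

I wasn't really happy with my first answer, since, as I said there, this problem looks like it is best solved with perl. You already said in one edit of your question that you have perl on the OS X machine you want to run this on, so I gave it a shot.

It is hard to get it all right in bash, i.e. avoiding any quoting issues with odd filenames, and behaving nicely with corner-case filenames.

So here it is in perl, a complete solution to your problem. It runs over all the files/directories listed on its command line.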


#!/usr/bin/perl -w
# whirlpool-rename.pl
# 2009 Peter Cordes <peter@cordes.ca>.  Share and Enjoy!

use Fcntl;      # for O_BINARY
use File::Find;
use Digest::Whirlpool;

# find callback, called once per directory entry
# $_ is the base name of the file, and we are chdired to that directory.
sub whirlpool_rename {
    print "find: $_\n";
#    my @components = split /\.(?:[[:xdigit:]]{128})?/; # remove .hash while we're at it
    my @components = split /\.(?!\.|$)/, $_, -1; # -1 to not leave out trailing dots

    if (!$components[0] && $_ ne ".") { # hidden file/directory
        $File::Find::prune = 1;
        return;
    }

    # don't follow symlinks or process non-regular-files
    return if (-l $_ || ! -f _);

    my $digest;
    eval {
        sysopen(my $fh, $_, O_RDONLY | O_BINARY) or die "$!";
        $digest = Digest->new( 'Whirlpool' )->addfile($fh);
    };
    if ($@) {  # exception-catching structure from whirlpoolsum, distributed with Digest::Whirlpool.
        warn "whirlpool: couldn't hash $_: $!\n";
        return;
    }

    # strip old hashes from the name.  not done during split only in the interests of readability
    @components = grep { !/^[[:xdigit:]]{128}$/ }  @components;
    if ($#components == 0) {
        push @components, $digest->hexdigest;
    } else {
        my $ext = pop @components;
        push @components, $digest->hexdigest, $ext;
    }

    my $newname = join('.', @components);
    return if $_ eq $newname;
    print "rename  $_ ->  $newname\n";
    if (-e $newname) {
        warn "whirlpool: clobbering $newname\n";
        # maybe unlink $_ and return if $_ is older than $newname?
        # But you'd better check that $newname has the right contents then...
    }
    # This could be link instead of rename, but then you'd have to handle directories, and you can't make hardlinks across filesystems
    rename $_, $newname or warn "whirlpool: couldn't rename $_ -> $newname:  $!\n";
}


#main
$ARGV[0] = "." if !@ARGV;  # default to current directory
find({wanted => \&whirlpool_rename, no_chdir => 0}, @ARGV );

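Advantages:

Actually uses whirlpool, so you can use this exact program (after installing libperl-digest-whirlpool). Easy to change to any digest function you want, because instead of different programs with different output formats, you have the perl Digest common interface.
Implements all other requirements: ignores hidden files (and files under hidden directories).
Able to handle any possible filename without error or security problems. (Several people got this right in their shell scripts.)
Follows best practices for traversing a directory tree, by chdiring down into each directory (like my previous answer, with find -execdir). This avoids problems with PATH_MAX, and with directories being renamed while you are running.
Clever handling of filenames that end with '.': foo..txt... -> foo..hash.txt...
Handles old filenames that already contain hashes without renaming them and then renaming them back. (It strips any sequence of 128 hex digits that is surrounded by "." characters.) In the everything-correct case, no disk write activity happens, just reads of every file. Your current solution runs mv twice in the already-correctly-named case, causing directory metadata writes, and it is slower because those two processes have to be execed.
Efficient. No program is fork/execed, while most of the solutions that actually work ended up having to run sed on something per file. Digest::Whirlpool is implemented with a natively compiled shared lib, so it is not slow pure perl. This should be faster than running a program on every file, especially for small files.
Perl supports UTF-8 strings, so filenames with non-ASCII characters should not be a problem. (I am not sure whether any multi-byte sequences in UTF-8 could include the byte that means ASCII '.' on its own. If that is possible, then you need UTF-8 aware string handling. sed does not know UTF-8. bash's glob expressions may.)
Easy to extend. When you go to put this into a real program and want to handle more corner cases, you can do so fairly easily, e.g. decide what to do when you want to rename a file but the hash-named filename already exists.
Good error reporting. Most of the shell scripts have this too, though, by passing along errors from the programs they run.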
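
On the UTF-8 question raised in the list above: by design, every byte of a multi-byte UTF-8 sequence has its high bit set, so none of them can equal an ASCII byte such as '.' (0x2E) or '/' (0x2F). A short Python check of that property on a few sample strings (the samples and the helper name are arbitrary; this illustrates the encoding rule, it does not test any of the scripts here):

```python
def utf8_bytes_of_non_ascii(text):
    """Collect the UTF-8 bytes produced by the non-ASCII characters of `text`."""
    return [b for ch in text if ord(ch) > 0x7F for b in ch.encode("utf-8")]

samples = ["æeiaæå", "漢字", "\U0001F600"]

# Lead and continuation bytes of multi-byte sequences are all >= 0x80,
# so none of them can be mistaken for '.' or '/'.
safe = all(b >= 0x80 for s in samples for b in utf8_bytes_of_non_ascii(s))
print(safe)
```

So byte-oriented tools that split on '.' should not cut a multi-byte character in half, as long as the names really are UTF-8.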

find . -type f -print | while read file
do
    hash=`$hashcommand "$file"`
    filename=${file%.*}
    extension=${file##*.}
    mv "$file" "$filename.$hash.$extension"
done

You might want to store the results in one file, like

find . -type f -exec md5sum {} \; > MD5SUMS

If you really want one file per hash:

find . -type f | while read f; do md5sum "$f" > "$f.md5"; done

or even

find . -type f | while read f; do g=`md5sum "$f" | awk '{print $1}'`; echo "$g $f" > "$f-$g.md5"; done
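
For comparison, the single-manifest variant can be sketched in Python as well. This is an illustration only: write_manifest is a hypothetical helper, it uses MD5 like the commands above, and it writes plain "digest, two spaces, relative path" lines without the escaping that md5sum applies to names containing backslashes or newlines.

```python
import hashlib
import os

def write_manifest(root, manifest_path):
    """Write one 'digest  relative/path' line per regular file under `root`.

    Returns the number of files listed.
    """
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                digest = hashlib.md5(fh.read()).hexdigest()
            lines.append("%s  %s\n" % (digest, os.path.relpath(path, root)))
    with open(manifest_path, "w", encoding="utf-8") as out:
        out.writelines(lines)
    return len(lines)
```

Keeping the manifest outside the tree being hashed avoids it listing itself on a second run.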

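Here's my take on it, in bash. Features: skips non-regular files; correctly deals with files with weird characters (i.e. spaces) in their names; deals with extensionless filenames; skips already-hashed files, so it can be run repeatedly (although if files are modified between runs, it adds a new hash rather than replacing the old one). I wrote it using md5 -q as the hash function; you should be able to replace this with anything else, as long as it only outputs the hash, not something like filename => hash.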

find -x . -type f -print0 | while IFS="" read -r -d $'\000' file; do
    hash="$(md5 -q "$file")" # replace with your favorite hash function
    [[ "$file" == *."$hash" ]] && continue # skip files that already end in their hash
    dirname="$(dirname "$file")"
    basename="$(basename "$file")"
    base="${basename%.*}"
    [[ "$base" == *."$hash" ]] && continue # skip files that already end in hash + extension
    if [[ "$basename" == "$base" ]]; then
            extension=""
    else
            extension=".${basename##*.}"
    fi
    mv "$file" "$dirname/$base.$hash$extension"
done

In sh or bash, two versions. One limits itself to files with extensions...

hash () {
  #openssl md5 t.sh | sed -e 's/.* //'
  whirlpool "$f"
}

find . -type f -a -name '*.*' | while read f; do
  # remove the echo to run this for real
  echo mv "$f" "${f%.*}.whirlpool-`hash "$f"`.${f##*.}"
done

...tested:

...
mv ./bash-4.0/signames.h ./bash-4.0/signames.whirlpool-d71b117a822394a5b273ea6c0e3f4dc045b1098326d39864564f1046ab7bd9296d5533894626288265a1f70638ee3ecce1f6a22739b389ff7cb1fa48c76fa166.h
...

And this more complex version processes all plain files, with or without extensions, with or without spaces and odd characters, and so on...

hash () {
  #openssl md5 t.sh | sed -e 's/.* //'
  whirlpool "$f"
}

find . -type f | while read f; do
  name=${f##*/}
  case "$name" in
    *.*) extension=".${name##*.}" ;;
    *)   extension=   ;;
  esac
  # remove the echo to run this for real
  echo mv "$f" "${f%/*}/${name%.*}.whirlpool-`hash "$f"`$extension"
done

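Whirlpool isn't a very common hash. You will probably need to install a program to compute it; e.g. Debian/Ubuntu include a "whirlpool" package. The program prints the hash of one file by itself. apt-cache search whirlpool shows that some other packages support it, including the interesting md5deep.

Some of the earlier answers will fail on filenames with spaces. If that is the case, but your files do not have any newlines in their names, then you can safely use \n as a delimiter.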


oldifs="$IFS"
IFS="
"
for i in $(find -type f); do echo "$i";done
#output
# ./base
# ./base2
# ./normal.ext
# ./trick.e "xt
# ./foo bar.dir ext/trick' (name "- }$foo.ext{}.ext2
IFS="$oldifs"

Try it without setting IFS to see why it matters.
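
The effect of the separator can also be seen outside the shell. The sketch below simulates find output for three made-up paths and splits it three ways: on any whitespace (roughly what the default IFS does), on newlines only, and on NUL bytes as with -print0.

```python
# Simulated output of `find`, one path per record, for three hypothetical files.
paths = ["./normal.ext", "./foo bar.ext", "./f\nnewline.ext"]

newline_stream = "\n".join(paths) + "\n"   # like `find -print`
nul_stream = "\0".join(paths) + "\0"       # like `find -print0`

by_whitespace = newline_stream.split()         # default IFS: space, tab, newline
by_newline = newline_stream.split("\n")[:-1]   # IFS set to newline only
by_nul = nul_stream.split("\0")[:-1]           # NUL-delimited records

print(len(by_whitespace), len(by_newline), len(by_nul))  # prints: 5 4 3
```

Only the NUL-delimited split gives back the original three names; the newline split still breaks the name that contains a newline.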

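I was going to try something with IFS="."; find -print0 | while read -a array, to split on "." characters, but I don't normally use array variables. There is no easy way that I can see in the man page to insert the hash as the second-last array index and push down the last element (the file extension, if it had one). Any time bash array variables start to look interesting, I know it is time to do what I'm doing in perl instead! See the gotchas for using read: http://tldp.org/LDP/abs/html/gotchas.html#BADREAD0

I decided to use another technique I like: find -exec sh -c. It is the safest, since you are not parsing filenames.

This should do the trick: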


find -regextype posix-extended -type f -not -regex '.*\.[a-fA-F0-9]{128}.*'  \
-execdir bash -c 'for i in "${@#./}";do 
 hash=$(whirlpool "$i");
 ext=".${i##*.}"; base="${i%.*}";
 [ "$base" = "$i" ] && ext="";
 newname="$base.$hash$ext";
 echo "ext:$ext  $i -> $newname";
 false mv --no-clobber "$i" "$newname";done' \
dummy {} +
# take out the "false" before the mv, and optionally take out the echo.
# false ignores its arguments, so it's there so you can
# run this to see what will happen without actually renaming your files.

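-execdir bash -c 'cmd' dummy {} + has the dummy arg there because the first argument after the command becomes $0 in the shell's positional parameters, not part of "$@", which is what the for loop iterates over. I use -execdir instead of -exec so I don't have to deal with directory names (or the possibility of exceeding PATH_MAX for nested dirs with long names, when the actual filenames are all short enough).

-not -regex prevents this from being applied twice to the same file. Whirlpool is a very long hash, and mv says "File name too long" if I run it twice without that check. (On an XFS filesystem.)

Files with no extension get basename.hash. I had to check specially to avoid appending a trailing ., or getting the basename as the extension. ${@#./} strips off the leading ./ that find puts in front of every filename, so there is no "." in the whole string for files with no extension.

mv --no-clobber may be a GNU extension. If you don't have GNU mv, do something else if you want to avoid deleting existing files (e.g. you run this once, then some of the same files are added to the directory under their old names, and you run it again). OTOH, if you want that behaviour, just take it out.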
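
Where mv --no-clobber is not available, the same idea can be approximated by checking for the target first. A minimal Python sketch, with a made-up helper name; note that the check and the rename are two separate steps, so this is not atomic and another process could still create the target in between.

```python
import os

def rename_no_clobber(src, dst):
    """Rename src to dst unless dst already exists. Returns True if renamed.

    The existence check and the rename are separate steps, so this is a
    sketch of the behaviour, not a race-free implementation.
    """
    if os.path.lexists(dst):
        return False
    os.rename(src, dst)
    return True
```

A caller can then decide what to do when it returns False, for example compare the two files' contents before removing one of them.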

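My solution should work even when filenames contain a newline (they can, you know!), or any other possible character. It would be faster and easier in perl, but you asked for shell.

wallenborn's solution for making one file with all the checksums (instead of renaming the originals) is pretty good, but inefficient. Don't run md5sum once per file; run it on as many files at once as will fit on its command line:

find dir -type f -print0 | xargs -0 md5sum > dir.md5

or, with GNU find, xargs is built in (note the + instead of ';'):

find dir -type f -exec md5sum {} + > dir.md5

If you just use find -print | xargs, filenames containing quotes or whitespace will get mangled, and even with xargs -d '\n' a filename containing a newline is still split, so be careful. If you don't know what files this script might be run on some day, always use -print0 or -exec. This is especially true if the filenames are supplied by untrusted users (i.e. they could be an attack vector on your server).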

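In response to the updated question:

If anyone can comment on how I can avoid looking in hidden directories with my bash script, it would be much appreciated.

You can avoid hidden directories with find by using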

find -name '.?*' -prune -o \( -type f -print0 \)

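-name '.*' -prune would prune ".", and stop without doing anything. :/

I would still recommend my perl version, though. I updated it... You may still need to install Digest::Whirlpool from CPAN, though.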

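Hm, interesting problem.

Try the following (the mktest function is just for testing -- TDD for bash! :)

Edit:

Added support for whirlpool hashes.
Code cleanup
Better quoting of filenames
Changed the array syntax for the test part -- it should now work with most korn-like shells. Note that pdksh does not support :-based parameter expansion (or rather, it means something else)

Also note that md5-mode fails for filenames with whirlpool-like hashes, and possibly vice versa.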

#!/usr/bin/env bash

#Tested with:
# GNU bash, version 4.0.28(1)-release (x86_64-pc-linux-gnu)
# ksh (AT&T Research) 93s+ 2008-01-31
# mksh @(#)MIRBSD KSH R39 2009/08/01 Debian 39.1-4
# Does not work with pdksh, dash

DEFAULT_SUM="md5"

#Takes a parameter, as root path
# as well as an optional parameter, the hash function to use (md5 or wp for whirlpool).
main()
{
  case $2 in
    "wp")
      export SUM="wp"
      ;;
    "md5")
      export SUM="md5"
      ;;
    *)
      export SUM=$DEFAULT_SUM
      ;;
  esac

  # For all visible files in all visible subfolders, move the file
  # to a name including the correct hash:
  find $1 -type f -not -regex '.*/\..*' -exec $0 hashmove '{}' \;
}

# Given a file named in $1 with full path, calculate it's hash.
# Output the filname, with the hash inserted before the extention
# (if any) -- or:  replace an existing hash with the new one,
# if a hash already exist.
hashname_md5()
{
  pathname="$1"
  full_hash=`md5sum "$pathname"`
  hash=${full_hash:0:32}
  filename=`basename "$pathname"`
  prefix=${filename%%.*}
  suffix=${filename#$prefix}

  #If the suffix starts with something that looks like an md5sum,
  #remove it:
  suffix=`echo $suffix|sed -r 's/\.[a-z0-9]{32}//'`

  echo "$prefix.$hash$suffix"
}

# Same as hashname_md5 -- but uses whirlpool hash.
hashname_wp()
{
  pathname="$1"
  hash=`whirlpool "$pathname"`
  filename=`basename "$pathname"`
  prefix=${filename%%.*}
  suffix=${filename#$prefix}

  #If the suffix starts with something that looks like an md5sum,
  #remove it:
  suffix=`echo $suffix|sed -r 's/\.[a-z0-9]{128}//'`

  echo "$prefix.$hash$suffix"
}


#Given a filepath $1, move/rename it to a name including the filehash.
# Try to replace an existing hash, an not move a file if no update is
# needed.
hashmove()
{
  pathname="$1"
  filename=`basename "$pathname"`
  path="${pathname%%/$filename}"

  case $SUM in
    "wp")
      hashname=`hashname_wp "$pathname"`
      ;;
    "md5")
      hashname=`hashname_md5 "$pathname"`
      ;;
    *)
      echo "Unknown hash requested"
      exit 1
      ;;
  esac

  if [[ "$filename" != "$hashname" ]]
  then
      echo "renaming: $pathname => $path/$hashname"
      mv "$pathname" "$path/$hashname"
  else
    echo "$pathname up to date"
  fi
}

# Create som testdata under /tmp
mktest()
{
  root_dir=$(tempfile)
  rm "$root_dir"
  mkdir "$root_dir"
  i=0
  test_files[$((i++))]='test'
  test_files[$((i++))]='testfile, no extention or spaces'

  test_files[$((i++))]='.hidden'
  test_files[$((i++))]='a hidden file'

  test_files[$((i++))]='test space'
  test_files[$((i++))]='testfile, no extention, spaces in name'

  test_files[$((i++))]='test.txt'
  test_files[$((i++))]='testfile, extention, no spaces in name'

  test_files[$((i++))]='test.ab8e460eac3599549cfaa23a848635aa.txt'
  test_files[$((i++))]='testfile, With (wrong) md5sum, no spaces in name'

  test_files[$((i++))]='test spaced.ab8e460eac3599549cfaa23a848635aa.txt'
  test_files[$((i++))]='testfile, With (wrong) md5sum, spaces in name'

  test_files[$((i++))]='test.8072ec03e95a26bb07d6e163c93593283fee032db7265a29e2430004eefda22ce096be3fa189e8988c6ad77a3154af76f582d7e84e3f319b798d369352a63c3d.txt'
  test_files[$((i++))]='testfile, With (wrong) whirlpoolhash, no spaces in name'

  test_files[$((i++))]='test spaced.8072ec03e95a26bb07d6e163c93593283fee032db7265a29e2430004eefda22ce096be3fa189e8988c6ad77a3154af76f582d7e84e3f319b798d369352a63c3d.txt'
  test_files[$((i++))]='testfile, With (wrong) whirlpoolhash, spaces in name'

  test_files[$((i++))]='test space.txt'
  test_files[$((i++))]='testfile, extention, spaces in name'

  test_files[$((i++))]='test   multi-space  .txt'
  test_files[$((i++))]='testfile, extention, multiple consequtive spaces in name'

  test_files[$((i++))]='test space.h'
  test_files[$((i++))]='testfile, short extention, spaces in name'

  test_files[$((i++))]='test space.reallylong'
  test_files[$((i++))]='testfile, long extention, spaces in name'

  test_files[$((i++))]='test space.reallyreallyreallylong.tst'
  test_files[$((i++))]='testfile, long extention, double extention,
                        might look like hash, spaces in name'

  test_files[$((i++))]='utf8test1 - æeiaæå.txt'
  test_files[$((i++))]='testfile, extention, utf8 characters, spaces in name'

  test_files[$((i++))]='utf8test1 - 漢字.txt'
  test_files[$((i++))]='testfile, extention, Japanese utf8 characters, spaces in name'

  for s in . sub1 sub2 sub1/sub3 .hidden_dir
  do

     #note -p not needed as we create dirs top-down
     #fails for "." -- but the hack allows us to use a single loop
     #for creating testdata in all dirs
     mkdir $root_dir/$s
     dir=$root_dir/$s

     i=0
     while [[ $i -lt ${#test_files[*]} ]]
     do
       filename=${test_files[$((i++))]}
       echo ${test_files[$((i++))]} > "$dir/$filename"
     done
   done

   echo "$root_dir"
}

# Run test, given a hash-type as first argument
runtest()
{
  sum=$1

  root_dir=$(mktest)

  echo "created dir: $root_dir"
  echo "Running first test with hashtype $sum:"
  echo
  main $root_dir $sum
  echo
  echo "Running second test:"
  echo
  main $root_dir $sum
  echo "Updating all files:"

  find $root_dir -type f | while read f
  do
    echo "more content" >> "$f"
  done

  echo
  echo "Running final test:"
  echo
  main $root_dir $sum
  #cleanup:
  rm -r $root_dir
}

# Test md5 and whirlpool hashes on generated data.
runtests()
{
  runtest md5
  runtest wp
}

#For in order to be able to call the script recursively, without splitting off
# functions to separate files:
case "$1" in
  'test')
    runtests
  ;;
  'hashname')
    hashname "$2"
  ;;
  'hashmove')
    hashmove "$2"
  ;;
  'run')
    main "$2" "$3"
  ;;
  *)
    echo "Use with: $0 test - or if you just want to try it on a folder:"
    echo "  $0 run path (implies md5)"
    echo "  $0 run path md5"
    echo "  $0 run path wp"
  ;;
esac

Using zsh:

$ ls
a.txt
b.txt
c.txt

And the magic:

$ FILES=(**/*(.))
$ # */ stupid syntax coloring thinks this is a comment
$ for f in $FILES; do hash=`md5sum $f | cut -f1 -d" "`; mv $f "$f:r.$hash.$f:e"; done
$ ls
a.60b725f10c9c85c70d97880dfe8191b3.txt
b.3b5d5c3712955042212316173ccf37be.txt
c.2cd6ee2c70b0bde53fbe6cac3c8b8bb1.txt

Happy deconstructing!

Edit: added files in subdirectories and quotes around the mv argument

Ruby:

#!/usr/bin/env ruby
require 'digest/md5'

Dir.glob('**/*') do |f|
  next unless File.file? f
  next if /\.md5sum-[0-9a-f]{32}/ =~ f
  md5sum = Digest::MD5.file f
  newname = "%s/%s.md5sum-%s%s" %
    [File.dirname(f), File.basename(f,'.*'), md5sum, File.extname(f)]
  File.rename f, newname
end

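Handles filenames that have spaces, have no extension, or have already been hashed.

Ignores hidden files and directories; if processing them is desired, add File::FNM_DOTMATCH as the second argument of glob.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow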
