Question

I have a directory with a very large number of subdirectories, and each of those subdirectories contains a file of interest that I want to concatenate, e.g.:

my_dir/
    subdir1/
            subsubdir/
                file_of_interest1.txt
                ...
    subdir2/
            subsubdir/
                file_of_interest1.txt
                ...
    ...

Now, I tried using cat my_dir/*/*/*.txt > all.txt
But unfortunately, the subdirectory tree is so large that I get the following error:

bash: /bin/cat: Argument list too long

Is there a clever way to circumvent the problem, e.g., by concatenating the files in smaller chunks? For instance, concatenating 1/3 of the subdirs, then another 1/3, then the last 1/3, and then joining the results all together?

Solution

Let find go through the files and add as many as possible to each cat invocation's command line:

find my_dir -type f -name '*.txt' -exec cat '{}' + >all.txt
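
To see the batching for yourself, here is a rough sketch rather than anything definitive. It builds a throwaway tree under a temporary directory (demo_root, the file count of 200 and the file contents are all made up for the demo), swaps cat for a tiny command that reports how many arguments each invocation received, and then runs the real concatenation.

```shell
#!/bin/sh
# Scratch tree mimicking my_dir/subdirN/subsubdir/file_of_interest1.txt.
# demo_root and the count of 200 are hypothetical, chosen only for this demo.
demo_root=$(mktemp -d)
i=1
while [ "$i" -le 200 ]; do
  mkdir -p "$demo_root/my_dir/subdir$i/subsubdir"
  printf 'content %s\n' "$i" > "$demo_root/my_dir/subdir$i/subsubdir/file_of_interest1.txt"
  i=$((i + 1))
done

# The limit that the original glob ran into.
echo "ARG_MAX on this system: $(getconf ARG_MAX)"

# One output line per invocation; the number is how many files it was handed.
batches=$(find "$demo_root/my_dir" -type f -name '*.txt' -exec sh -c 'echo "$#"' sh '{}' +)
echo "files per invocation: $batches"

# Same find, with cat doing the actual work.
find "$demo_root/my_dir" -type f -name '*.txt' -exec cat '{}' + > "$demo_root/all.txt"
line_count=$(wc -l < "$demo_root/all.txt" | tr -d ' ')
echo "lines in all.txt: $line_count"

rm -rf "$demo_root"
```

With a tree this small everything should fit into a single invocation; on a tree large enough to exceed the limit you would see several lines, one per batch.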

If your find doesn't support -exec ... {} + (which it should if compliant with current versions of the POSIX spec), there's also an approach using the -print0 and -0 extensions (originally from GNU, also available in the BSD tools) to make xargs safe:

find my_dir -type f -name '*.txt' -print0 | xargs -0 cat >all.txt

Using xargs without -0 is unsafe: it mishandles filenames containing newlines, among other issues (some, but not all, of which can be avoided with other options). Consider a malicious user creating a file named $'foo \n/etc/passwd': you don't want to run the risk of injecting /etc/passwd into your output.
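
The difference is easy to reproduce in a harmless way. The following is only a sketch, assuming a find and xargs that support -print0 and -0; the directory and file names are invented for the demo. It counts how many arguments xargs hands to its command with and without NUL delimiters.

```shell
#!/bin/sh
# Scratch directory with one ordinary file and one whose name contains a newline.
# demo_dir, plain.txt and the foo<newline>bar.txt name are hypothetical.
demo_dir=$(mktemp -d)
printf 'ordinary\n' > "$demo_dir/plain.txt"
newline_name=$(printf 'foo\nbar.txt')
printf 'tricky\n' > "$demo_dir/$newline_name"

# Newline-delimited: xargs cannot tell the newline inside the name from a
# separator, so the tricky name is split into bogus arguments.
unsafe_count=$(find "$demo_dir" -type f -name '*.txt' -print | xargs sh -c 'echo "$#"' sh)

# NUL-delimited: each of the two file names arrives as exactly one argument.
safe_count=$(find "$demo_dir" -type f -name '*.txt' -print0 | xargs -0 sh -c 'echo "$#"' sh)

echo "arguments without -0: $unsafe_count"
echo "arguments with -0:    $safe_count"

rm -rf "$demo_dir"
```

There are only two files, so any count other than 2 means xargs passed along names that don't correspond to what find actually found.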

Finally, there's the less-efficient, older way to use find -exec (which invokes a separate copy of cat for each file found):

find my_dir -type f -name '*.txt' -exec cat '{}' ';' >all.txt

...or, at a similar penalty (of invoking cat multiple times), you can simply use a loop in your shell script:

for f in my_dir/*/*/*.txt; do
  cat "$f"
done >all.txt

Note that this does the redirection on the entire loop, rather than (less efficiently) on a per-file basis.
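
If you want to keep the glob (and its sorted order) without paying for one cat per file, one possible middle ground is to let the shell's builtin printf emit the names NUL-delimited and have xargs do the batching. This is a sketch, not one of the approaches above: it assumes printf is a builtin in your shell (it is in bash and most others), so the expanded glob never passes through exec, and it assumes an xargs with -0. The demo_top tree is made up for illustration.

```shell
#!/bin/sh
# Hypothetical three-directory tree, just enough to show the ordering.
demo_top=$(mktemp -d)
for name in subdir1 subdir2 subdir3; do
  mkdir -p "$demo_top/my_dir/$name/subsubdir"
  printf '%s\n' "$name" > "$demo_top/my_dir/$name/subsubdir/file_of_interest1.txt"
done

# The builtin printf writes each expanded path followed by a NUL byte;
# xargs -0 then packs as many paths as fit onto each cat command line.
printf '%s\0' "$demo_top"/my_dir/*/*/*.txt | xargs -0 cat > "$demo_top/all.txt"

glob_result=$(cat "$demo_top/all.txt")
printf '%s\n' "$glob_result"

rm -rf "$demo_top"
```

Unlike find, whose traversal order is unspecified, this concatenates the files in the order the glob expands them.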


Aside: If using POSIX sh or bash, quoting {} isn't necessary. However, you do need to quote {} if attempting to support zsh, and so I do so here.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow