Question

I came across the following command, which nearly does what I need:

find . -type f -print0 | xargs -0 ls -l | sort -k5,5rn > ~/files.txt

Now, I don't have a clue what any of this means (would love an explanation, but not that important).

The one thing I need to add is a way to skip specific folders (e.g. I have a Documents folder with tens of thousands of Word docs, which makes this command take a very long time).

Can anyone suggest an addition to the above command that will have find ignore a given folder(s)?

Solution

Exclude paths matching */Documents/* from find:

find . -type f ! -path "*/Documents/*" -print0 | ...
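One caveat: a ! -path test only filters what gets printed, so find still walks through every file inside Documents. If the goal is to save time, -prune stops find from descending into a folder at all, and several folders can be grouped with -o. A minimal sketch on a throwaway directory tree (the folder and file names here are made up for illustration):

```shell
# Build a small throwaway tree to experiment on (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/Documents" "$demo/Library" "$demo/Music"
touch "$demo/Documents/a.doc" "$demo/Library/b.plist" "$demo/Music/c.mp3" "$demo/top.txt"

# -prune tells find not to descend into the matching directories;
# -o ("or") applies the remaining tests to everything else.
found=$(cd "$demo" && find . \( -path ./Documents -o -path ./Library \) -prune -o -type f -print | sort)
printf '%s\n' "$found"

rm -rf "$demo"
```

In the original pipeline, -print would become -print0 so that the output can still be fed to xargs -0.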

OTHER TIPS

Since you asked for an explanation...

find . -type f -print0

That's the find utility, which walks the file system looking for anything that matches your criteria. The . is the starting point: the current directory and everything beneath it. Since you specified -type f, it will only report "regular files" (not directories, symlinks, and so on). -print0, as you may have guessed, prints the path of each match to standard output (useful for piping). It ends each path with a null character rather than the newline that -print uses, which matters for file names containing spaces or newlines (this will be relevant in a moment).
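To see the difference between the two print actions, here is a small sketch on a throwaway directory (the file names are made up). The null characters are invisible in a terminal, so the example converts them to newlines for display and counts them separately:

```shell
# Throwaway directory with one awkward file name (hypothetical names).
demo=$(mktemp -d)
touch "$demo/plain.txt" "$demo/two words.txt"

# -print0 ends each path with a NUL byte; tr turns those into
# newlines here only so that the result is readable.
paths=$(cd "$demo" && find . -type f -print0 | tr '\0' '\n' | sort)
printf '%s\n' "$paths"

# Deleting everything except the NUL bytes and counting what is left
# gives one byte per file found.
count=$(cd "$demo" && find . -type f -print0 | tr -cd '\0' | wc -c | tr -d ' ')
echo "files found: $count"

rm -rf "$demo"
```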

xargs -0 ls -l

xargs reads a list of items from standard input and runs a given command ("utility") with those items as its arguments. In this case, the utility is the command ls -l, so xargs takes the paths from find and runs ls -l on them, giving you the long listing for each file; this is basically just a way to turn your list of files into a list of files with information such as permissions, owner, size, and modification date. The -0 option tells xargs to treat null characters as the separator between items instead of whitespace, which exists (almost?) solely to allow it to work with the -print0 option above.
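A quick way to see why the -print0 and -0 pairing matters is a file name containing a space. This is only a sketch on a throwaway directory with a made-up file name:

```shell
# Throwaway directory with a single file whose name contains a space.
demo=$(mktemp -d)
touch "$demo/two words.txt"

# Newline-separated paths: xargs splits "./two words.txt" on the space,
# so ls is asked about "./two" and "words.txt", neither of which exists.
broken=$(cd "$demo" && find . -type f -print | xargs ls -l 2>/dev/null | wc -l | tr -d ' ')

# NUL-separated paths: the name reaches ls in one piece.
intact=$(cd "$demo" && find . -type f -print0 | xargs -0 ls -l | wc -l | tr -d ' ')

echo "lines listed without -0: $broken"
echo "lines listed with -0:    $intact"

rm -rf "$demo"
```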

sort -k5,5rn > ~/files.txt

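sort is pretty cool. It sorts things. -k tells it which column to sort by, in this case column 5 (and only column 5), which in ls -l output is the file size in bytes. The rn bit means compare as numbers (n) and reverse the order (r). The default order is ascending, with the largest at the bottom, so reversing puts the largest first. The > ~/files.txt part writes the result to files.txt in your home directory instead of the terminal. Sorting numerically gives misleading results if the sizes carry unit suffixes (B, K, M, G, etc.), as they do with ls -lh, because only the leading number is compared.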
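The effect of the key and of unit suffixes can be tried without touching any real files. The listing below is invented sample data laid out like ls -l output, not the result of an actual run:

```shell
# Invented sample lines in ls -l layout; column 5 is the size in bytes.
listing='-rw-r--r--  1 user  staff      900 Jan  1 10:00 ./small.txt
-rw-r--r--  1 user  staff  1048576 Jan  1 10:00 ./big.bin
-rw-r--r--  1 user  staff    20480 Jan  1 10:00 ./medium.doc'

# Sort on column 5 only, numerically, largest first.
sorted=$(printf '%s\n' "$listing" | sort -k5,5rn)
printf '%s\n' "$sorted"

# The same sizes with unit suffixes, as ls -lh would show them.
# A numeric sort reads only the leading number (900, 1.0, 20),
# so the smallest file ends up listed first.
human='900B ./small.txt
1.0M ./big.bin
20K ./medium.doc'
misordered=$(printf '%s\n' "$human" | sort -k1,1rn)
printf '%s\n' "$misordered"
```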

Different options or other ways to find large files:

  • find ~ -size +100M ! -path ~/Documents\* ! -path ~/Library\*
  • find ~ -size +100M | grep -v "^$HOME/Documents/" | while IFS= read -r l; do stat -f'%z %N' "$l"; done | sort -rn
  • shopt -s extglob; find ~/!(Documents) -type f -exec stat -f'%z %N' {} \; | sort -rn | head -n200
  • mdfind 'kMDItemFSSize>=1e8&&kMDItemContentTypeTree!=public.directory' | while IFS= read -r l; do stat -f'%z %N' "$l"; done | sort -rn
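The ideas in the list above (filter by size, skip a folder, sort by size) can be combined in one portable command. The following is a sketch on a throwaway tree with tiny files and a 2048-byte threshold so that it runs instantly; the names and sizes are made up, and on a real home directory the threshold would be something like +100M as in the examples above:

```shell
# Throwaway tree: one "large" file to find, one to skip, one too small.
demo=$(mktemp -d)
mkdir -p "$demo/Documents" "$demo/Music"
head -c 4096 /dev/zero > "$demo/Documents/skipped.bin"
head -c 4096 /dev/zero > "$demo/Music/large.bin"
head -c 10 /dev/zero > "$demo/Music/tiny.txt"

# Skip Documents entirely, keep regular files larger than 2048 bytes
# (the c suffix means bytes), list them, and sort by size, largest first.
big=$(cd "$demo" && find . -path ./Documents -prune -o -type f -size +2048c -exec ls -l {} + | sort -k5,5rn)
printf '%s\n' "$big"

rm -rf "$demo"
```

Using -exec ls -l {} + hands the matching paths to ls directly, so no xargs or -print0 is needed in this variant.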

You might also just use Finder.

Licensed under: CC-BY-SA with attribution