Question

I want to calculate the total working copy size of my currently checked out branch in git (only versioned files; not ignored files or files not yet a descendant of HEAD). Currently I tried this:

$ git ls-files | du -k | cut -f1 | awk '{total += $1} END {print total}'

However this takes an insane amount of time and outputs a number that doesn't seem to make sense:

29800260

I'm not sure what this number is. If there is a faster or more accurate command to do this please let me know.

Solution

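You're on the right path, but du takes file names as arguments rather than reading them from standard input, so use command substitution (backticks) instead of a pipe: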

du -k `git ls-files` | awk '{total += $1} END {print total}'

If you have to worry about whitespace in file names, or the file list is too long for a single command line, both git ls-files and xargs can talk in nulls:

git ls-files -z | xargs -0 du -k | awk '{total += $1} END {print total}'
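To see why the null-delimited form matters, here is a small sketch that does not need a repository. The printf calls are only a stand-in for git ls-files and git ls-files -z, and the file name is made up for illustration:

```shell
# Scratch directory with one file whose name contains a space.
tmp=$(mktemp -d)
printf 'hello' > "$tmp/my file.txt"

# Whitespace-delimited input: xargs splits the name into "my" and "file.txt",
# du finds neither of them, and nothing reaches stdout.
split_lines=$(cd "$tmp" && printf '%s\n' 'my file.txt' | xargs du -k 2>/dev/null | wc -l)

# NUL-delimited input (stand-in for `git ls-files -z`): the name stays intact.
nul_lines=$(cd "$tmp" && printf '%s\0' 'my file.txt' | xargs -0 du -k | wc -l)

echo "whitespace-split: $((split_lines)) line(s), NUL-delimited: $((nul_lines)) line(s)"
rm -rf "$tmp"
```

The same splitting happens with the backtick version, so the xargs -0 form is the safer default when file names are not under your control.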

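It took so long for you because du does not read file names from standard input. The git ls-files | part was essentially a no-op, and du -k ran with no arguments, so it walked everything from your current directory down (including .git and ignored files) and printed a cumulative total for every directory. Summing those overlapping totals is also why the number made no sense. Exactly what you were trying to avoid!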
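You can check this behaviour in a scratch directory. This is only a sketch with made-up file names, not something to run inside your repository:

```shell
# Scratch directory containing one file and one subdirectory with a file.
tmp=$(mktemp -d)
mkdir "$tmp/sub"
printf 'a' > "$tmp/tracked.txt"
printf 'b' > "$tmp/sub/other.txt"

# Piped in: du ignores stdin and walks the current directory instead,
# printing one line per directory ("./sub" and ".").
piped_lines=$(cd "$tmp" && echo tracked.txt | du -k | wc -l)

# Passed as an argument: du reports only the file that was named.
named_lines=$(cd "$tmp" && du -k tracked.txt | wc -l)

echo "piped: $((piped_lines)) line(s), named: $((named_lines)) line(s)"
rm -rf "$tmp"
```

The piped run reports on directories you never asked about, which is the same thing that happened across your whole working copy.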

OTHER TIPS

Note that du shows the amount of disk space that a file takes up, not its length in bytes. This may vary depending on the file system type and settings. For example, a small 1-byte file may take up a block of 4 KB, or its content may be stored inline with the metadata.

If you want a sum of the exact byte sizes of all the files in the repository instead, you can use this:

git ls-tree -r -l HEAD | cut -d' ' -f 4- | awk '{s+=$1} END {printf "%.0f\n", s}'

This uses the blob size that is stored in Git and also works on a bare repository. The result is typically smaller than the one from du.
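The gap between disk usage and exact byte size is easy to observe on a single file. A rough sketch, assuming a scratch file made up for illustration; the du figure depends on your file system, so no particular value is promised:

```shell
# Create a file that holds exactly one byte.
tmp=$(mktemp -d)
printf 'x' > "$tmp/one_byte"

# Exact length in bytes, as wc sees it.
bytes=$(wc -c < "$tmp/one_byte")

# Disk usage in KB, as du sees it (file-system dependent).
disk_kb=$(du -k "$tmp/one_byte" | cut -f1)

echo "exact size: $((bytes)) byte(s); du -k reports: ${disk_kb} KB"
rm -rf "$tmp"
```

On many file systems the second number is 4, which is how a repository full of small files ends up with a du total well above the sum of its blob sizes.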

Licensed under: CC-BY-SA with attribution