Question

I'm trying to shrink a Git repo that's around 2GB.

I'm following the "Removing Objects" section of Chapter 9 (9.7) in the ProGit book: http://git-scm.com/book/en/Git-Internals-Maintenance-and-Data-Recovery

After running git gc, then a git verify-pack and git rev-list command, I've found a .tar file that's about 118MB. I don't need it to be in my repo at all. It needs destroying forever.

But when I try to find out which commits have used that file, I get nothing:

git log --oneline -- news/news.tar

Does this mean the file is not in the repository history? And if that's the case can I just leave it or will it still get pushed up as an object when I do git push?

If it will still get pushed, how do I get rid of it?


Solution

Use The BFG, not git-filter-branch...

The BFG gives a foolproof method of getting rid of large files, and it is much easier than using git filter-branch. See http://rtyley.github.io/bfg-repo-cleaner/ :

$ bfg --strip-blobs-bigger-than 100M my-repo.git
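For context, the one-liner above is usually run as part of a slightly longer sequence. The following is only a sketch, assuming Java is installed and the BFG jar has been downloaded (bfg is commonly set up as an alias for java -jar bfg.jar). The function name clean_big_blobs, the repository URL and the jar path are hypothetical placeholders, not anything from the question.

```shell
#!/bin/sh
# Sketch of a typical BFG run. The function name and both arguments are
# hypothetical placeholders; adjust them to your own setup.
clean_big_blobs() {
    repo_url=$1   # e.g. a clone URL for your repository (placeholder)
    bfg_jar=$2    # path to the downloaded BFG jar (placeholder)

    # Work on a fresh bare mirror so that every ref is included.
    git clone --mirror "$repo_url" my-repo.git || return 1

    # Rewrite history, stripping blobs larger than 100M.
    # By default the BFG leaves the contents of the latest commit alone.
    java -jar "$bfg_jar" --strip-blobs-bigger-than 100M my-repo.git || return 1

    # The old objects stay in the object database until the reflog is
    # expired and unreachable objects are pruned.
    (cd my-repo.git &&
        git reflog expire --expire=now --all &&
        git gc --prune=now --aggressive) || return 1

    # Push the rewritten refs back; a mirror clone pushes all refs.
    (cd my-repo.git && git push) || return 1
}
```

Nothing runs until the function is called, for example clean_big_blobs <your-repo-url> bfg.jar. Because the push rewrites history on the remote, other clones of the repository will need to be re-cloned or reset afterwards.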

...still here?

If you'd like to figure out what went wrong while you were following the steps from "Removing Objects" in section 9.7 of the ProGit book, there are unfortunately several places where it's possible to slip up.

The news.tar file is still reachable from one of your branches, tags or some other kind of ref. We know this because the file showed up when you ran git rev-list --objects --all (the --all parameter means look at all refs, not just branches: tags and other, more exotic types of ref too). Your git log --oneline -- news/news.tar command, however, does not have an --all, so it only lists commits reachable from your current branch. Your blob is being held by a different reference, probably a different branch or tag.
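The difference is easy to reproduce in a throwaway repository. The script below is a minimal sketch: the temporary directory, the branch name side and the file contents are made up for the demonstration, and only the path news/news.tar comes from the question.

```shell
#!/bin/sh
set -e
# Throwaway repository; every name here except news/news.tar is hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# An ordinary commit on the starting branch.
echo "hello" > README
git add README
git commit -q -m "initial commit"
main_branch=$(git rev-parse --abbrev-ref HEAD)

# Commit the "big" file on a side branch only.
git checkout -q -b side
mkdir news
echo "pretend this is 118MB" > news/news.tar
git add news/news.tar
git commit -q -m "add news.tar"

# Go back to the starting branch, where the file never existed.
git checkout -q "$main_branch"

# Without --all, only commits reachable from the current branch are searched.
current_only=$(git log --oneline -- news/news.tar)
# With --all, every ref (branches, tags, and so on) is searched.
all_refs=$(git log --all --oneline -- news/news.tar)

# Find which branches contain the commit that touched the file.
commit=$(git log --all --format=%H -- news/news.tar | head -n 1)
holders=$(git branch -a --contains "$commit")

echo "current branch only: [$current_only]"
echo "all refs:            [$all_refs]"
echo "held by:             $holders"
```

On the real repository, the equivalent check would be git log --all --oneline -- news/news.tar, followed by git branch -a --contains <commit> and git tag --contains <commit> to see which refs still hold the file.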

Seriously, just use The BFG.

Full disclosure: I'm the author of the BFG Repo-Cleaner.

Licensed under: CC-BY-SA with attribution