Multiple “git add” before “git commit”
Question
Here are my experiments.
git init
echo hello > some.txt
git add some.txt
-- objects
-- f2 (blob "hello")
echo hola > some.txt
git add some.txt
-- objects
-- f2 (blob "hello")
-- 53 (blob "hola")
git commit -m "..."
-- objects
-- f2 (blob "hello")
-- 53 (blob "hola")
-- 5c (tree "some.txt" -> 53)
-- 61 (commit "tree 5c")
As we can see, every "git add" created a blob object, and "git commit" committed the last blob, 53.
But notice that the intermediate blob "f2" is still in the repository. Is there any reason for this? How can I use this blob? Or how can I remove it?
Solution
Whee, it took me a minute to understand what you were asking :)
Git saves everything for at least a period of time. If you run
git fsck
you should see
dangling blob f2...
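To actually use such a blob, you can print its content with git cat-file. Here is a minimal sketch that replays the experiment from the question in a scratch repository; the temporary directory and the variable names (repo, dangling, recovered) are just for illustration.

```shell
# Scratch repository in a temporary directory (illustrative setup only).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

echo hello > some.txt
git add some.txt    # writes a blob for "hello"
echo hola > some.txt
git add some.txt    # writes a blob for "hola"; the first blob is now unreferenced

# git fsck reports unreferenced objects as "dangling blob <hash>"; keep the hash.
dangling=$(git fsck 2>/dev/null | awk '/^dangling blob/ {print $3}')

# Print the content of the dangling blob to get the earlier version back.
recovered=$(git cat-file -p "$dangling")
echo "$recovered"
```

Redirecting that output to a file (git cat-file -p <hash> > old.txt) gives you the "oops" version back.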
Git deliberately lets unreferenced objects sit for a while. The idea is that if you "oops" something, the file is still there to be found. It's also a "lazy optimization": git add saves the staged state as a content-addressed object, and git commit just builds references (a tree and a commit) to those objects. The cleanup part is separate. You should look at the documentation for git prune and git gc.
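"Content-addressed" means the object name is derived from the content alone, which you can check with git hash-object. A small sketch (the variable names id1, id2, id3 are just for illustration):

```shell
# The object ID is a hash of the content (plus a small type/size header),
# so identical content always maps to the same ID.
id1=$(echo hello | git hash-object --stdin)
id2=$(echo hello | git hash-object --stdin)
id3=$(echo hola  | git hash-object --stdin)

echo "hello -> $id1"
echo "hello -> $id2"
echo "hola  -> $id3"
```

This is why each git add with new content leaves a new blob behind, while re-adding unchanged content creates nothing new.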
By default, git gc only prunes unreachable objects that are older than 2 weeks (the gc.pruneExpire setting), so the blob gets cleaned up by some run of git gc that happens at least 2 weeks later. Also, git reflog (often used to salvage commits and rebases that screwed everything up) would lose much of its usefulness if cleanup were aggressive.
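If you don't want to wait, you can prune right away. A minimal sketch in a scratch repository, assuming you really don't need the unreferenced objects anymore (the directory and the variable names before and after are just for illustration):

```shell
# Scratch repository with one dangling blob, as in the question.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
echo hello > some.txt
git add some.txt
echo hola > some.txt
git add some.txt

# Count dangling blobs before pruning.
before=$(git fsck 2>/dev/null | awk '/^dangling blob/ {n++} END {print n+0}')

# Delete unreachable loose objects regardless of their age.
# Objects referenced by the index (the staged "hola" blob) are kept.
git prune --expire=now

# Count dangling blobs after pruning.
after=$(git fsck 2>/dev/null | awk '/^dangling blob/ {n++} END {print n+0}')
echo "dangling blobs before: $before, after: $after"
```

git gc --prune=now does the same as part of a full garbage collection. Either way, the "oops" safety net is gone for those objects.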