Question

When I read about git-rebase, I understood the rebased commits should get lost. I say should because I noticed that, knowing the rebased commit sha, I can recall it.
Suppose I have the following three commits

A -> B -> C

where C's sha is cshaid. Then, if I interactively rebase fixing-up C into B with git rebase -i HEAD~2 and then I check the result with git log, I obtain the expected result, meaning

A -> B'

where B''s sha is different from B's sha.
However, running git log cshaid shows again

A -> B -> C

Questions: is this a known behavior? I tried reading git rebase --help but couldn't find related info. Why are rebased commits not simply forgotten? I mean, rebase is a somewhat dangerous operation, to be performed only if you know what you are doing and are allowed to do it, so what is the point of keeping a dirty index (or wherever these useless commits are kept)? Am I missing something?
Steps to reproduce (and to better understand my doubts). If you are willing to reproduce the situation, try:

  1. mkdir sampledir && cd sampledir && git init
  2. touch file && git add -A . && git commit -m "Initial"
  3. edit file, then git commit -am "First modification"
  4. edit file, then git commit -am "Second modification"
  5. git log, you will see three commits, remember the sha for Second modification
  6. git rebase -i HEAD~2, then fixup Second modification into First modification
  7. git log, you will see two commits, where the sha for First modification is now different than in step 5
  8. however, git log sha-for-"Second modification" will show the exact same tree as point 5 in this list
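The steps above can also be scripted so they run without an editor. This is only a sketch: the throwaway identity, the temporary directory, and the use of GIT_SEQUENCE_EDITOR with sed to rewrite the todo list are assumptions added for illustration, not part of the original steps.

```shell
set -e
# Throwaway identity so commits work without any global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Steps 1-4: a scratch repository with three commits.
repo=$(mktemp -d)
cd "$repo"
git init -q
touch file && git add -A . && git commit -q -m "Initial"
echo one >> file && git commit -q -am "First modification"
echo two >> file && git commit -q -am "Second modification"

# Step 5: remember the sha of "Second modification".
c_sha=$(git rev-parse HEAD)

# Step 6: instead of editing by hand, turn the second "pick" line into "fixup".
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/fixup/'" git rebase -q -i HEAD~2

# Step 7: count the commits reachable from the branch after the rebase.
after_count=$(git rev-list --count HEAD)

# Step 8: the old commit is still readable by its sha, ancestry included.
old_type=$(git cat-file -t "$c_sha")
old_count=$(git rev-list --count "$c_sha")

echo "from HEAD: $after_count commits; from old sha: $old_count commits ($old_type)"
```

The branch now reaches two commits, while the remembered sha still reaches all three of the original ones.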

Solution

Yes, this is the expected behavior. Unreferenced commits will eventually be garbage-collected and thereby purged from disk. Unreachable objects are kept around for a grace period (gc.pruneExpire, two weeks by default), but a commit does not even count as unreachable while the reflog still mentions it, and reflog entries for commits that are no longer reachable from the branch tip expire after 30 days by default (gc.reflogExpireUnreachable).

Other tips

... I understood the rebased commits should get lost

They're not lost, they're (deliberately) "abandoned" (my term).

It's true that rebase copies (the contents of) the old commits. In fact, except for special optimizations and such, it's basically identical to doing git cherry-pick (and the interactive rebase script uses git cherry-pick for each "pick" operation, and amend-style commits for "squash" and "fixup" operations).

When and whether commits in a repository are visible, however, is decided by something else entirely. Normally git log starts with the name of the branch you're on, as recorded in HEAD (there's a file in your .git directory called HEAD, which contains the string ref: refs/heads/master, and that's how git knows that you're "on branch master").[1]

Given a branch name, git turns that into a (single) commit by "reading the reference":

$ git rev-parse master   # note: you can also rev-parse HEAD directly
676699a0e0cdfd97521f3524c763222f1c30a094

The log command can then read the commit object by its SHA-1. That commit object has some parent SHA-1s, and git log reads those too, and so on, until it reaches a commit that has no parents (a "root" commit).
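The walk described above can be imitated by hand. The following is a rough sketch, assuming a scratch repository with three linear commits named A, B and C (the repository, the identity, and the variable names are made up for illustration); it follows only first parents, which is enough for a linear history.

```shell
set -e
# Throwaway identity so commits work without any global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
for msg in A B C; do
    echo "$msg" >> file
    git add file
    git commit -q -m "$msg"
done

# Resolve the name to a single commit, then follow parent links one by one.
commit=$(git rev-parse HEAD)
walked=""
while [ -n "$commit" ]; do
    subject=$(git log -1 --format=%s "$commit")
    walked="$walked$subject "
    # "<sha>^" names the first parent; for a root commit, --verify -q fails
    # silently and prints nothing, which ends the loop.
    commit=$(git rev-parse --verify -q "$commit^" || true)
done

echo "$walked"
```

The loop visits C, then B, then A, and stops at A because a root commit has no parent to resolve.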

So, given a root commit A, and second and third commits B and C—plus a label, master, that points to C:

A <-- B <-- C   <-- HEAD=master

(the arrows here show who points to whom, which is the opposite direction from your drawing!), git can find (reach) commits A through C, starting at C and working backwards.

The rebase copies B and folds in C, giving B' as you expected:

A <-- B <-- C
 ^
  \
    B'

What makes B' show up with git log is that the label, master, is "peeled off" of commit C and "pasted onto" commit B'. More precisely, the file for branch master (.git/refs/heads/master)[2] gets rewritten with the new SHA-1 for B':

A <-- B <-- C   [no label, "abandoned"]
 ^
  \
    B'   <-- HEAD=master

As the answers that beat me to it noted, the "abandoned" commits (along with any other abandoned objects in the repository) are eventually removed for real by the "garbage collector", git gc.

The claim that there's "no" label is a little overblown, though. There's at least one label, hidden away in the "reflog", that keeps commit C from being garbage-collected. And, if you create a branch or tag label that refers to C, either before or after the rebase moves the master label, that label will also keep C in the repository, accessible by "ordinary" name, and you'll see it with git log --all (which looks at all branch and tag names, rather than just the one in HEAD).
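Both kinds of label can be observed in a scratch repository. In this sketch a plain git reset --hard stands in for the rebase (an assumption made to keep it short; either way the branch label moves off C), and the branch name keep-c is hypothetical.

```shell
set -e
# Throwaway identity so commits work without any global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
for msg in A B C; do
    echo "$msg" >> file
    git add file
    git commit -q -m "$msg"
done
c_sha=$(git rev-parse HEAD)

# Move the branch label from C back to B (stand-in for the rebase).
git reset -q --hard HEAD~1

# The reflog for HEAD still records C, which protects it from git gc for now.
in_reflog=$(git reflog show --format=%H HEAD | grep -c "$c_sha")

# No branch or tag reaches C, so walking from all refs finds only A and B.
before=$(git rev-list --all --count)

# Give C an ordinary name again (hypothetical branch name).
git branch keep-c "$c_sha"
after=$(git rev-list --all --count)

echo "reflog entries for C: $in_reflog; commits via refs: $before, then $after"
```

After keep-c is created, git log --all shows C again, because --all starts from every branch and tag name rather than only from HEAD.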


[1] The HEAD file can instead contain a raw SHA-1. In this case you have what git calls a "detached HEAD": you're at a commit by its SHA-1, rather than by its branch name.

[2] Branch and tag names (really, any reference at all) can be "packed", in which case the separate file goes away. This saves space, and you're not supposed to depend on the existence of the separate file. However, once a branch becomes "active" (that is, updated a lot), the separate file will reappear, since it's faster and easier to update that one file than to update the packed-refs file.

git rebase, like other commands that alter history in a destructive way, removes the reference to the obsolete commit, but doesn't cause it to be immediately deleted. git gc, which is automatically executed periodically during the course of normal operation, will (eventually) delete the actual commit data from .git/objects (although the reflog will keep a reference to the commit alive for some time).

This is a safety feature; it makes it quite difficult to actually lose data with git. If you really want to make sure something is gone -- for example, if you've accidentally committed a gigantic file and you want to get back the disk space -- you need to expire the reflog entries and run git gc manually:

git reflog expire --expire=now --all
git gc --aggressive --prune=now
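To check that these two commands really purge an abandoned commit, something like the following can be run in a scratch repository. It is a sketch under assumptions: git commit --amend is used here as the history-rewriting step instead of a rebase, and the identity and variable names are made up.

```shell
set -e
# Throwaway identity so commits work without any global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo data > file
git add file
git commit -q -m "original"
old_sha=$(git rev-parse HEAD)

# Rewrite the commit; the new message guarantees a different sha.
git commit -q --amend -m "rewritten"

# The old commit is abandoned but still on disk, kept alive by the reflog.
if git cat-file -e "$old_sha" 2>/dev/null; then before=present; else before=gone; fi

# Drop every reflog entry, then prune unreachable objects immediately.
git reflog expire --expire=now --all
git gc -q --aggressive --prune=now

if git cat-file -e "$old_sha" 2>/dev/null; then after=present; else after=gone; fi

echo "old commit before gc: $before, after gc: $after"
```

Only do this in a repository where nothing abandoned needs to be recovered, since it removes the safety net described above.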

I understood the rebased commits should get lost.

Nope. Commits don't get lost. In a busy repo, git will eventually garbage-collect things that are completely unreachable from any ref and have been for a month or more, but other than git gc, git operations only add to the history graph.

Moving labels around has no effect at all on the actual histories in your repo.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow