Are there crucial differences between stashing, pulling & popping vs. committing & pull --rebase?

StackOverflow https://stackoverflow.com/questions/21972171

  •  15-10-2022

Question

My workflow goal is often to avoid a meaningless merge. To accomplish that I often do one of two things:

  1. I commit my changes and git pull --rebase to prevent an unneeded merge. If I later wish to change an existing commit, I make changes, commit, and merge them in using git rebase --interactive.
  2. I simply git stash my changes, then git pull normally, and git stash pop these changes later when I wish to modify and/or commit them.

Some colleagues warned me that git stash save / git stash pop are unsafe. I am wondering if there are any subtle advantages to using commits and git pull --rebase vs. the stash list.

Solution

TL;DR version

Using git stash avoids the "meaningless merge" only if you have no commits of your own. The git pull --rebase method is more general.

Long, tutorial-ish version: what's happening, why, etc.

Using git stash just makes some commits,[1] so to a first approximation, stashing and committing are basically the same. The stash commits take the form of what I like to call a "stash bag", which is hung off the tip-most commit of whatever branch you're on when you do the stash.

The first difference is that the stash-bag commits do not move the tip of the branch. That is, suppose the branch (let's call it devel) looks like this:

... <- E <- F     <-- devel

then after stashing, it looks like this:

... <- E <- F     <-- devel
            |\
            i-w   <-- "the stash"

(This is an ASCII-art depiction of [part of] the "commit graph": commit F is the tip of the branch. Commit F has, as its parent commit, commit E: F "points back to" E. Meanwhile E points back to its parent, and so on. The name devel simply points to the tip-most commit, which is F.)

If you did a regular commit, you'd get a new commit G, pointing back to F, and devel would be changed to point to G, "moving the tip forward".
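To see the stash-bag for yourself, here is a minimal sketch in a throwaway repository. The identity (demo), the file name, and the temporary directory are just assumptions for the demonstration; only the git commands matter.

```shell
set -e
# Throwaway identity so commit and stash work without any global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)                 # scratch repository
cd "$repo"
git init -q .
echo one > file.txt
git add file.txt
git commit -q -m F
git branch -M devel               # name the branch as in the drawing

before=$(git rev-parse devel)
echo two >> file.txt              # uncommitted work in progress
git stash >/dev/null              # makes the i and w commits
after=$(git rev-parse devel)      # devel still points at F

w=$(git rev-parse refs/stash)             # work-tree commit
w_parent=$(git rev-parse refs/stash^1)    # first parent of w: F
i_parent=$(git rev-parse refs/stash^2^1)  # parent of index commit i: F

git --no-pager log --graph --oneline devel refs/stash
```

The graph printed at the end should show w as a two-parent commit sitting on top of F and i, with devel still at F.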

The second difference with git stash (vs git commit) occurs at the "other end", as it were. When you run git stash apply,[2] what git does is much like the following. (There are many implementation details that make an accurate description tough, but I think this is the way to think about it. If you apply with --index it's more complicated; keep in mind that this simplified picture is only "close enough" for the case without --index.)

  1. Diff the stash-bag against the commit it's hung from.
  2. Then, apply that diff as a patch to wherever you are now, using git's merge machinery to do a better job than just a simple straightforward patch. (I.e., git can tell if parts of the patch were already done, and if so, skip them.)
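The two steps above can be watched in a scratch repository. This is only a sketch of the simplified picture (plain apply, no --index); the identity, file names, and contents are made up for the demonstration.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
printf 'a\nb\nc\n' > file.txt
git add file.txt
git commit -q -m F

printf 'a\nb\nc\nmine\n' > file.txt     # work in progress
git stash >/dev/null

# Step 1: diff the stash-bag's w commit against the commit it hangs from.
patch=$(git diff refs/stash^1 refs/stash)
printf '%s\n' "$patch"

# Move "wherever you are now" forward with an unrelated commit.
echo other > other.txt
git add other.txt
git commit -q -m H

# Step 2: git applies that change on top of the current commit.
git stash apply >/dev/null
git --no-pager diff                     # the same change, now relative to H
```

Note that apply leaves the stash entry in place, so nothing is lost if the result is not what you wanted.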

To see how this applies to your situation(s), we have to look at what git pull does, both with and without --rebase.


The pull command is best described as fetch-then-merge. (It even says so in the manual pages for it.) With --rebase, it's best described as fetch-then-rebase. So we have two very different cases with the two different ways of invoking pull.

The fetch step is easy enough to describe. You fetch from some "remote", which tells git to call up the remote repository over the Internet-phone :-) [3] and ask it what branches and commits and such it has. Then your git has their git hand over any new goodies, which your git stores away in your repository, so that you have everything they do, plus of course anything of your own.[4]

Again, let's say that you are on branch devel and you run this kind of git fetch step. Let's say further that when you do that, branch devel looks like this:

... E - F - G   <-- devel

When your git contacts their git, it may find new commits that you did not have before, such as commit H, which points back to some earlier commit as its parent.

Perhaps H's parent commit is commit G. For this to happen, though, your colleague has to have had commit G already. Thus, assuming you made G, you need to have published (pushed) it, so that she got it, and made her H based on G.

But presumably you have not yet pushed G, and her commit H points back to F. That's the more general case, so let's draw it:

... - E - F - G   <-- devel
            \
              H   <-- (her/their idea of what "devel" looks like)

Since commit G is private to your repository, she, like everyone else, thinks the chain goes E - F - H. You must now do something to incorporate "your commit" together with "her commit".

The most accurate record of the work that occurred is for you to make a new merge commit M:

... - E - F - G - M   <-- devel
            \   /
              H

This is what git merge will do, so it's what will happen with a plain git pull.

The annoying thing with being completely accurate, history-wise, is that it gets you these "meaningless merges".[5] So what you can do is, instead, copy your old commit G to a new, slightly-different commit, G'. There will be two changes between G and G': (1) in the work-tree associated with G', you'll include her changes from H first; and (2) in G', you'll say the parent-commit is H rather than F. This will look like this (let's move your old G up to make the line go E - F - H):

              G       [no longer needed, hence abandoned]
            /
... - E - F - H - G'  <-- devel

This is a "rebase" operation: copy your existing commits, changing their work-directory contents as needed, tacking the new commits onto the appropriate place (H), and then making your branch-tip point to the last commit in the new copies.

This works even if you made a whole bunch of commits, G1 through G5 or whatever; it just takes more copying.

When you use git pull --rebase, git does this for you. First it uses fetch to bring over any new commits, and then, if there are some new commit(s), it rebases your previous commits onto the new ones.[6]
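Here is a small sketch of that copy in action, using two scratch repositories. The directory names hers and mine, the identity, and the file names are assumptions for the demonstration; hers stands in for the shared remote.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

top=$(mktemp -d)
cd "$top"
git init -q hers
( cd hers && echo base > base.txt && git add base.txt && git commit -q -m F )
git clone -q hers mine            # both sides now share commit F

( cd hers && echo h > h.txt && git add h.txt && git commit -q -m H )
her_h=$(git -C hers rev-parse HEAD)

cd mine
echo g > g.txt
git add g.txt
git commit -q -m G                # private commit, based on F
old_g=$(git rev-parse HEAD)

git pull -q --rebase              # fetch H, then copy G on top of it as G'

new_g=$(git rev-parse HEAD)       # G', a different commit from G
parent=$(git rev-parse HEAD^)     # its parent is her H
git --no-pager log --graph --oneline
```

The log should come out as one straight line, F then H then G, with no merge commit.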


So now we can get back to git stash. If you have not made any new commits of your own on devel, but have some work-in-progress, and you use git stash to save it, you get this:

... - E - F       <-- devel
          |\
          i-w

You now use git pull without --rebase, and it brings in commit H ("hers"—we're skipping over the letter G entirely, reserving it for now) and does a merge. Git does this as a "fast-forward merge" since you have no commits of your own, and you get this:

... - E - F - H   <-- devel
          |\
          i-w

You then git stash apply, which makes git look at the changes between commits F and w and merge them into your working directory. That is, it applies your changes to the working directory for commit H. Once you also drop the stash (or if we just don't bother drawing it), git add your changes, and git commit, you get a new commit. For some reason :-) let's call it G' instead of G. So you now have:

... - E - F - H - G'  <-- devel

which looks exactly the same as if you'd committed first, then run git pull --rebase. In fact, the "abandoned" G commit in the earlier case is really the same commit as the (dropped, i.e., abandoned) stash-bag commit![7]
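The stash route for this no-commits-of-your-own case can be sketched the same way. As before, hers, mine, the identity, and the file names are made up. I spell out --no-rebase because recent git versions may be configured to rebase on pull.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

top=$(mktemp -d)
cd "$top"
git init -q hers
( cd hers && echo base > base.txt && git add base.txt && git commit -q -m F )
git clone -q hers mine

( cd hers && echo h > h.txt && git add h.txt && git commit -q -m H )
her_h=$(git -C hers rev-parse HEAD)

cd mine
echo wip >> base.txt              # work in progress, no commits of my own
git stash >/dev/null
git pull -q --no-rebase           # nothing to merge: a fast-forward to H

git stash apply >/dev/null        # re-apply the work on top of H
git stash drop >/dev/null         # happy with the result, so drop it
git add base.txt
git commit -q -m "G'"

parent=$(git rev-parse HEAD^)
git --no-pager log --graph --oneline
```

The history ends up as F, H, G' in a straight line, just as in the drawing.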


But what if you had already made a commit G (or several, but we'll just use one) before you git stash some more changes? Then you have this:

... - E - F - G     <-- devel
              |\
              i-w   <-- stash

Now you git pull (without --rebase) and pick up her commit H and merge it:

              H
            /   \
... - E - F - G - M    <-- devel
              |\
              i-w   <-- stash

Finally, you apply the stash, make sure it's all good, drop it, git add, and make a new commit N:

              H
            /   \
... - E - F - G - M - N   <-- devel

and you have one of those annoying "meaningless merges". It came in when you did the git pull without --rebase.
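This case is easy to reproduce in scratch repositories. Again, the names hers and mine, the identity, and the files are assumptions for the sketch. I pass --no-rebase and --no-edit explicitly, since recent git versions refuse to guess how to reconcile diverged branches and would otherwise open an editor for the merge message.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

top=$(mktemp -d)
cd "$top"
git init -q hers
( cd hers && echo base > base.txt && git add base.txt && git commit -q -m F )
git clone -q hers mine

( cd hers && echo h > h.txt && git add h.txt && git commit -q -m H )

cd mine
echo g > g.txt
git add g.txt
git commit -q -m G                      # my own commit, based on F
echo wip >> base.txt                    # plus more work in progress
git stash >/dev/null

git pull -q --no-rebase --no-edit       # G and H have diverged: merge commit M

git stash apply >/dev/null
git stash drop >/dev/null
git add base.txt
git commit -q -m N

merges=$(git rev-list --count --merges HEAD)
m_parents=$(git rev-list --parents -n 1 HEAD^ | wc -w | tr -d ' ')
git --no-pager log --graph --oneline
```

The graph shows the diamond from the drawing: M has both G and H as parents, and N sits on top of M.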


The short version, then, is that git stash only saves your bacon (avoids annoying merge) if you have no commits of your own. The git pull --rebase method is more general. (Although, the problematic "upstream rebase" case notwithstanding, I much prefer doing a separate git fetch step. Then I look over what came in, and choose whether to rebase or merge. But that's up to you.)


[1] Specifically, it makes at least two commits. First it makes one for the current index, i.e., what you'd get if you did git commit without any git add, git rm, etc, and forcing the commit to exist (a la --allow-empty) even if the tree is unchanged. Then it makes a multi-parent commit, i.e., a merge commit, with the current working directory as its contents. All these commits are done in a way that does not move the branch-tip. For additional details see this answer.

[2] I recommend using git stash apply, checking the result, and then using git stash drop if you're satisfied with the effect of apply. The pop command just means apply-then-drop, i.e., it assumes that you are satisfied. But if you use git stash a lot you may have multiple stashes, and you might accidentally apply the wrong one, or too many of them, or something. If you're in the habit of "apply first, get everything all set, and only then drop", I think you're likely to make fewer mistakes. Of course, people differ. :-)
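A sketch of that habit with two stashes on the list. The file names and contents are invented for the demonstration.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo a > a.txt
echo b > b.txt
git add a.txt b.txt
git commit -q -m base

echo 'change to a' >> a.txt
git stash >/dev/null                     # ends up as stash@{1}
echo 'change to b' >> b.txt
git stash >/dev/null                     # newest, so stash@{0}

git --no-pager stash list                # check which stash is which
git stash apply 'stash@{1}' >/dev/null   # apply, but keep the entry
git --no-pager diff --stat               # look at the result first
git stash drop 'stash@{1}' >/dev/null    # satisfied: only now drop it

remaining=$(git stash list | wc -l | tr -d ' ')
```

If the apply had picked the wrong stash, the entry would still be on the list and you could reset the work tree and try again.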

[3] Unless the "remote" is really local, e.g., file://whatever, or a local path; or conceivably in the future there may be some non-"Internet" URLs. Git does not really care how it gets the new stuff from the remote, only that it can find out what the remote has, and bring that over so that it's now local.

[4] When you use git pull, it invokes fetch with some special limits turned on, so that you fetch only stuff that you intend to merge (or rebase-onto). In pre-1.8.4 versions of git this also inhibits updating your local "remote branch" entry, i.e., fetch fails to save a bit of useful information. As the release notes for git 1.8.4 put it:

this was an early design decision to keep the update of remote tracking branches predictable, but in practice it turns out that people find it more convenient to opportunistically update them whenever we have a chance, and we have been updating them when we run "git push" which already breaks the original "predictability" anyway.

[5] They do have some meaning (they mean you and she worked in parallel), but by next week if not earlier, nobody cares. That's just noise. This is generally true of all kinds of commits: if you try something, and then make a few more commits and have to back out the earlier "try something" as a total failure, the attempt plus the backing-out are probably just noise. If these commits are all private (non-published) you can use an interactive rebase to "edit history" to make it look like you never bothered doing the failed experiment. At the same time, the failed experiment might actually be useful information: "don't try it this way, that doesn't work". It's up to you to figure out what is Good Information, and what is Meaningless Noise.

[6] It's worth noting that git pull --rebase is extra-clever in the case of an "upstream history rewrite". Suppose, before you pull, you have this:

...-o-x-x-Y   <-- branch
         `------- origin/branch

where o and x represent "their" commits and Y is/are "your" commits (they could be Y1-Y2-Y3 etc; it works out the same in the end). Suppose that when you run the git fetch step, it turns out "they" rebased branch themselves, so that instead of o-x-x as what's "on" origin/branch, you get o-*-*-*:

...-o-x-x-Y   <-- branch
     \   `------- old origin/branch
      *-*-*   <-- FETCH_HEAD, to become new origin/branch

It's obvious (well, this drawing should make it seem obvious...) which commits were rebased upstream: they're the ones spelled * instead of x. So it's also obvious (heh) that git can rebase the Y chain onto the tip * commit, as pointed to by FETCH_HEAD:

...-o-x-x-Y     [abandoned]
     \
      *-*-*-Y'  <-- branch
           `------- new origin/branch

If you use a "regular" fetch, rather than the one in git pull --rebase, this updates the remote branch, origin/branch, right away. The "fork point" is easily identified here only up until origin/branch is moved to point to the new tip; after that it is obscured. Fortunately there is enough information in git's reflogs to reconstruct it, and in git 1.9/2.0, now that git fetch always updates the remote branches, there is a way to ask git to find the fork-point later (git merge-base --fork-point), so that you can recover from upstream rebases more easily.

[7] More precisely, it has the same tree as commit w in the stash-bag.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow