For git revert to "back out" a change, it needs to figure out what the change was.
In the case of most ordinary commits, the change is easy to compute. Consider for instance this git commit graph fragment:
... - G - H ... <-- HEAD=master
Here you're on branch master, which has commits G, H, and then some more. If you ask git to revert commit H, it simply needs to see what changed between "everything in revision G" and "everything in revision H". Git can do this the same way you can, by comparing G and H:
$ git diff <sha1-of-G> <sha1-of-H>
If this says that in commit H, you added one line to file readme.txt and removed file x.h entirely, then git can undo this by removing that one line from readme.txt and restoring file x.h from commit G.
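To make the ordinary case concrete, here is a minimal sketch in a throwaway repository. The file contents, the identity settings, and the temporary directory are assumptions made up for illustration; only the file names readme.txt and x.h come from the example above.

```shell
#!/bin/sh
# Sketch: revert an ordinary (single-parent) commit in a scratch repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"              # illustrative identity
git config user.email "example@example.com"

# Commit G: readme.txt and x.h both exist.
printf 'line one\n' > readme.txt
printf '#define X 1\n' > x.h
git add readme.txt x.h
git commit -q -m "G"

# Commit H: add one line to readme.txt and remove x.h entirely.
printf 'line two\n' >> readme.txt
git rm -q x.h
git add readme.txt
git commit -q -m "H"

# What H changed: compare G (HEAD~1) with H (HEAD).
git diff --name-status HEAD~1 HEAD

# Revert H. This adds a new commit on top of H.
git revert --no-edit HEAD >/dev/null

# The new commit's tree should be identical to G's tree.
g_tree=$(git rev-parse "HEAD~2^{tree}")
new_tree=$(git rev-parse "HEAD^{tree}")
```

If the two tree IDs match, the revert has restored exactly the content of G, while the history still contains both H and the commit that undoes it.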
Merge commits are more complex, though. Let's fill in some more of that commit graph:
              I - J
            /       \
... - G - H           M - N - O   <-- HEAD=master
            \       /
              K - L
If you ask git to revert the merge commit M, what changes should it back out?
There's one set of changes in going from J to M:
$ git diff <sha1-of-J> <sha1-of-M>
(these changes are, in fact, the changes brought in via commit L as compared to commit H, which will be the changes from commits K and L combined).
There's another, likely quite different, set of changes going from L to M:
$ git diff <sha1-of-L> <sha1-of-M>
(these changes are actually those in I and J, by similar logic).
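The two diffs can be seen side by side in a scratch repository shaped like the graph above. This is only a sketch: the mk helper, the one-file-per-commit layout, and the tags named after each commit are assumptions introduced here so the commits can be referred to by letter.

```shell
#!/bin/sh
# Sketch: the two different diffs you get by comparing a merge with each parent.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # make sure the branch is called master
git config user.name "Example"              # illustrative identity
git config user.email "example@example.com"

# Hypothetical helper: each commit adds one file named after itself,
# and gets a tag of the same name so it can be referenced by letter.
mk() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; git tag "$1"; }

mk G; mk H
git checkout -q -b branch
mk K; mk L
git checkout -q master
mk I; mk J
git merge -q -m "M" branch                  # merge commit M
git tag M

# J -> M: what the merge brought in from the K - L side.
from_j=$(git diff --name-only J M)
# L -> M: what the merge brought in from the I - J side.
from_l=$(git diff --name-only L M)

echo "J..M: $from_j"
echo "L..M: $from_l"
```

In this layout the first diff lists only the files added by K and L, and the second lists only the files added by I and J, which is why git cannot guess which of the two you mean to undo.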
You must tell git which set of changes to undo, and which to keep. Git has you do this by specifying the "main line" (the -m option, which takes a parent number). This relies on the fact that the parent IDs stored in a merge commit are in a specific order.
Let's say you were on commit J, which was master, when you made the merge:
$ git checkout master # i.e., commit J
$ git merge branch # i.e., commit L
Now you are on commit O, which is still master. The branch name branch may no longer exist (or might point to some commit other than L), but you want to discard the changes in both K and L, i.e., the ones that were on branch branch when you did the merge.
The first parent of M is J, because you merged branch into master, which git records by making J the first parent and L the second parent. Thus, to discard the changes from commits K and L, you could now use:
$ git revert -m 1 HEAD~2
(Here HEAD~2 backs up two commits, from O to N and then to M.) Git can then diff M^1 (J) against M, which finds the changes introduced by merging in branch branch, as noted above; reversing those changes backs out what the merge introduced.
Note that this makes a new commit, resulting in a graph that looks like this:
              I - J
            /       \
... - G - H           M - N - O - P   <-- HEAD=master
            \       /
              K - L
where comparing commits O and P produces essentially the same thing as comparing M and J (in that order, i.e., the reverse of the "normal" compare from J to M). As far as later operations in git are concerned, though, you might as well have done this by hand-editing the tree for O and making the new commit P: it does not record (except in the commit message text) that P is essentially a revert of both K and L.
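The whole sequence, including the parent order and the new commit P, can be checked in a scratch repository. As before, this is a sketch under assumptions: the mk helper, the one-file-per-commit layout, and the letter tags are hypothetical conveniences, not part of the scenario itself.

```shell
#!/bin/sh
# Sketch: revert merge commit M with -m 1 and inspect the resulting commit P.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # make sure the branch is called master
git config user.name "Example"              # illustrative identity
git config user.email "example@example.com"

# Hypothetical helper: one file and one tag per commit, named after the commit.
mk() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; git tag "$1"; }

mk G; mk H
git checkout -q -b branch
mk K; mk L
git checkout -q master                      # on J's line of history
mk I; mk J
git merge -q -m "M" branch                  # J becomes parent 1, L parent 2
git tag M
mk N; mk O

# Confirm the parent order recorded in M.
first_parent=$(git rev-parse "M^1")
second_parent=$(git rev-parse "M^2")

# Revert the merge relative to its first parent (the "main line").
git revert --no-edit -m 1 HEAD~2 >/dev/null # creates commit P

# O -> P should match M -> J: both remove what K and L added.
o_to_p=$(git diff --name-status O HEAD)
m_to_j=$(git diff --name-status M J)
echo "$o_to_p"
```

After the revert, the files from K and L are gone from the working tree while those from I, J, N, and O remain, and M itself is still in the history with both parents intact.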
Incidentally, it's worth noting that in this particular case, you could simply revert L first, then revert K, to (probably) get the same effect (with two separate extra commits):
$ L=$(git rev-parse HEAD~2^2) # get sha-1 ID of commit L
$ git revert $L # make new commit P that reverts L
$ git revert $L^ # make new commit Q that reverts L^ = K
With a big merge, though, reverting each individual change is a lot of work; reverting the merge commit itself is much easier (both to do, and to understand later, if properly documented). (Also, the "probably" above is because the merge handles identical changes made on "both sides" of the branch, and reverting the merge avoids undoing changes in K and L that were not brought forward into M because they also occurred in I and/or J. However, this is somewhat rare, especially in tiny branching structures like this.)
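For the simple case, where nothing in K or L duplicates a change in I or J, the two approaches can be compared directly. The sketch below makes that assumption explicit by giving every commit its own file; the mk helper, the letter tags, and the two scratch branch names are hypothetical.

```shell
#!/bin/sh
# Sketch: compare "revert the merge" with "revert L, then K" in the simple case.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # make sure the branch is called master
git config user.name "Example"              # illustrative identity
git config user.email "example@example.com"

# Hypothetical helper: one file and one tag per commit, named after the commit.
mk() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; git tag "$1"; }

mk G; mk H
git checkout -q -b branch
mk K; mk L
git checkout -q master
mk I; mk J
git merge -q -m "M" branch
git tag M
mk N; mk O

# Approach 1: a single revert of the merge, relative to its first parent.
git checkout -q -b via-merge-revert O
git revert --no-edit -m 1 M >/dev/null

# Approach 2: two ordinary reverts, L first and then K.
git checkout -q -b via-two-reverts O
git revert --no-edit L >/dev/null
git revert --no-edit K >/dev/null

# Compare the resulting trees (content only, ignoring history).
t1=$(git rev-parse "via-merge-revert^{tree}")
t2=$(git rev-parse "via-two-reverts^{tree}")
[ "$t1" = "$t2" ] && echo "same tree" || echo "different trees"
```

The trees match here only because the two sides touch disjoint files; when both sides made the same change, the results can differ as described above.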