Question

While building a git log viewer, I was looking at git log output and wondered about three things. Since I couldn't find documentation of the exact output format in the Git SCM book or similar resources, I'm asking here on SO:

  1. What's the index 1234567..1234567 123456? It doesn't match the commit's SHA?
  2. What is the int after the comma for a changed line? @@ -40,20 +40,20 @@?
  3. What is the part after the second @@ for a changed line?

Example, taken from the Git SCM book, "Viewing the Commit History":

$ git log -p -2
commit ca82a6dff817ec66f44342007202690a93763949
Author: Scott Chacon <schacon@gee-mail.com>
Date:   Mon Mar 17 21:52:11 2008 -0700

    changed the version number

diff --git a/Rakefile b/Rakefile
index a874b73..8f94139 100644
--- a/Rakefile
+++ b/Rakefile
@@ -5,5 +5,5 @@ require 'rake/gempackagetask'
 spec = Gem::Specification.new do |s|
     s.name      =   "simplegit"
-    s.version   =   "0.1.0"
+    s.version   =   "0.1.1"
     s.author    =   "Scott Chacon"
     s.email     =   "schacon@gee-mail.com"

Solution

1. What's the index 1234567..1234567 123456? It doesn't match the commit's SHA?

From the git-log man page:

The index line includes the SHA-1 checksum before and after the change. The <mode> is included if the file mode does not change; otherwise, separate lines indicate the old and the new mode.

Okay, cool... but what does that mean?

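So, git stores every object internally under an address derived from its content. The two SHA-1 checksums on the index line are the (abbreviated) addresses of the file's versions before and after your change, which is why neither of them matches the commit's SHA.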

You can see that by using a git-internal command to access those files:

$ git cat-file -p a874b73
<--- the contents of the file before the commit --->

$ git cat-file -p 8f94139
<--- the contents of the file after the commit --->
<--- the contents of the file after the commit --->

The number after the two hashes, the <mode>, indicates the file type and the Unix file permissions (100644 is a regular, non-executable file). See this answer for more info about how to read them.
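For a log viewer, the index line can be split into its three parts with a small regular expression. The following is only a minimal sketch: the names parse_index_line and describe_mode are made up for this example, and it assumes the simple `index <old>..<new> [<mode>]` shape shown above (combined diffs for merge commits use a different layout that is not handled here).

```python
import re
import stat
from typing import NamedTuple, Optional

# Matches "index <old>..<new>" with an optional six-digit octal mode at the end.
INDEX_RE = re.compile(r"^index ([0-9a-f]+)\.\.([0-9a-f]+)(?: ([0-7]{6}))?$")


class IndexLine(NamedTuple):
    old_blob: str         # abbreviated object name before the change
    new_blob: str         # abbreviated object name after the change
    mode: Optional[str]   # None when git prints the mode on separate lines


def parse_index_line(line: str) -> IndexLine:
    """Split one 'index ...' line of git log -p output into its parts."""
    match = INDEX_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an index line: {line!r}")
    return IndexLine(match.group(1), match.group(2), match.group(3))


def describe_mode(mode: str) -> str:
    """Turn an octal mode string such as '100644' into a short description."""
    bits = int(mode, 8)
    kind = "regular file" if stat.S_ISREG(bits) else "other"
    return f"{kind}, permissions {stat.S_IMODE(bits):o}"


print(parse_index_line("index a874b73..8f94139 100644"))
print(describe_mode("100644"))
```

An index line without a trailing mode, such as `index 0000000..e69de29` for a newly created file, parses with mode set to None.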

See below for an example explaining the hashes!

EDIT: Whoops, forgot to address the second and third questions.

2. What is the int after the comma for a changed line? @@ -40,20 +40,20 @@?

The changes are displayed using unified diff format. The Wikipedia article on it has a pretty good explanation, but basically that line is the range information. It consists of two pairs: the first pair has a - sign and the second pair a + sign. The - pair refers to the original file, and the + pair to the new file. In each pair, the first number is the starting line number of the chunk about to be displayed, and the second number is how many lines of that file the chunk covers.

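So @@ -10,5 +10,10 @@

would mean that the chunk covers 5 lines of the original file and 10 lines of the new file, both starting at line 10. The chunk is therefore a net five lines longer in the second version (for example, because five lines were added and none were removed).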
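The range line can be parsed mechanically. Here is a small sketch of how a viewer might do it; the names HunkHeader and parse_hunk_header are invented for this example. It relies on the unified diff rule that a missing count means 1, and it also captures whatever text follows the second @@.

```python
import re
from typing import NamedTuple

# "@@ -<start>[,<count>] +<start>[,<count>] @@[ <heading>]"
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@ ?(.*)$")


class HunkHeader(NamedTuple):
    old_start: int
    old_count: int
    new_start: int
    new_count: int
    heading: str


def parse_hunk_header(line: str) -> HunkHeader:
    """Parse one '@@ ... @@' line; an omitted count defaults to 1."""
    match = HUNK_RE.match(line.rstrip("\n"))
    if match is None:
        raise ValueError(f"not a hunk header: {line!r}")
    old_start, old_count, new_start, new_count, heading = match.groups()
    return HunkHeader(
        int(old_start),
        1 if old_count is None else int(old_count),
        int(new_start),
        1 if new_count is None else int(new_count),
        heading,
    )


print(parse_hunk_header("@@ -5,5 +5,5 @@ require 'rake/gempackagetask'"))
print(parse_hunk_header("@@ -0,0 +1 @@"))
```

For the second header the parser reports an old count of 0 and a new count of 1, which matches a file going from empty to a single line.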

3. What is the part after the second @@ for a changed line?

This text is meant to give the context that the chunk is in. So, as @TimWolla's answer below points out, if this were a C/C++ program, it would typically be the name of the function that this chunk is inside. In this case, require 'rake/gempackagetask' is a line of Ruby near the top of the file, and it was the closest preceding line that diff considered an appropriate name to refer to this section by.
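To make the idea concrete, here is a rough approximation of the default heuristic as I understand it: when no language-specific pattern is configured, git looks backwards from the line just above the chunk and picks the first line that begins with a letter, an underscore, or a dollar sign. Treat this as an assumption rather than a specification; the function name and the sample file contents (apart from the lines visible in the example) are made up.

```python
from typing import List


def guess_hunk_heading(lines: List[str], hunk_start: int) -> str:
    """Return the nearest line above the hunk that starts in column 0
    with a letter, '_' or '$'. `hunk_start` is the 1-based line number
    of the first line shown in the hunk."""
    # Line numbers hunk_start-1 .. 1 are list indices hunk_start-2 .. 0.
    for index in range(hunk_start - 2, -1, -1):
        line = lines[index]
        if line and (line[0].isalpha() or line[0] in "_$"):
            return line.rstrip()
    return ""  # nothing suitable found: the header has no heading


# Hypothetical file layout; lines 1, 2 and 4 are invented for illustration.
sample = [
    "require 'rubygems'",
    "",
    "require 'rake/gempackagetask'",
    "",
    "spec = Gem::Specification.new do |s|",  # line 5, first line of the hunk
    '    s.name      =   "simplegit"',
]

print(guess_hunk_heading(sample, 5))
```

Indented lines are skipped, which is why in C code the result tends to be the enclosing function's signature rather than a statement inside it.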


Here are a couple examples of git's SHA-1 checksum at work:

$ touch newfileA newfileB
$ git add newfile*
$ git commit -m 'added new files'
[master a49cb1c] added new files
 2 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 newfileA
 create mode 100644 newfileB

$ git log -p -1
commit a49cb1c5292082a6ed9c7f09e1bce2636e60ab93
Author: Nathan Daly <NHDaly@gmail.com>
Date:   Sat Mar 1 16:10:49 2014 -0500

    added new files

diff --git a/newfileA b/newfileA
new file mode 100644
index 0000000..e69de29
diff --git a/newfileB b/newfileB
new file mode 100644
index 0000000..e69de29

You can see that before the commit the files didn't exist (because there is no address referring to a previous version: thus index 0000000), and after they were created, the blobs for those files have the address e69de29. They share the same hash, because the files are identical, so there's no reason to have different copies of an empty file.

You can see this by just hashing an empty file. We get the same hash as before (this time in its full length):

$ touch blankfile
$ git hash-object blankfile
e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
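The same value can be reproduced without git. A blob's address is the SHA-1 of a short header ("blob", a space, the content length in bytes, and a NUL byte) followed by the content itself. The sketch below assumes a repository using the default SHA-1 object format; the function name git_blob_hash is made up.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Compute the object name git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# An empty file hashes to the value shown by `git hash-object blankfile`.
print(git_blob_hash(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the file name is not part of the input, two files with the same content always get the same address.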

Now, if we change the contents of the files:

$ echo 'contents1' > newfileA
$ echo 'contents2' > newfileB
$ git add newfile*
$ git commit -m 'updated newfile contents'    
[master 76827d7] updated newfile contents
 2 files changed, 2 insertions(+)

$ git log -p -1
commit 76827d7af1846c6c0f07ac2b78771cbc34cd6056
Author: Nathan Daly <NHDaly@gmail.com>
Date:   Sat Mar 1 16:18:40 2014 -0500

    updated newfile contents

diff --git a/newfileA b/newfileA
index e69de29..a024003 100644
--- a/newfileA
+++ b/newfileA
@@ -0,0 +1 @@
+contents1
diff --git a/newfileB b/newfileB
index e69de29..6b46faa 100644
--- a/newfileB
+++ b/newfileB
@@ -0,0 +1 @@
+contents2

You can see that they now have different hashes. Each of them started from the object at e69de29 and moved to a new object with a different address. And, to verify that, we can get the contents of those objects from their respective hashes:

$ git cat-file -p a024003
contents1
$ git cat-file -p 6b46faa
contents2

Finally, if we made their contents equal again, they would once again share a hash. (The hash depends only on a file's contents, not on its name, so git stores identical contents just once, which also saves disk space.)

$ echo 'contents2' > newfileA
$ git add newfileA
$ git commit -m 'now files match'
$ git log -p -1
commit 3f63bcef9290fae616521ec1b380639c6026c5c5
Author: Nathan Daly <NHDaly@gmail.com>
Date:   Sat Mar 1 16:22:29 2014 -0500

    now files match

diff --git a/newfileA b/newfileA
index a024003..6b46faa 100644
--- a/newfileA
+++ b/newfileA
@@ -1 +1 @@
-contents1
+contents2

As you can see, now newfileA also has the hash 6b46faa, just like newfileB.

Each object in git is stored with such a hash address, so in your git log viewer application, you can use those index hashes to show the user the contents of the file at each of those versions!
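In a viewer, that could look roughly like the sketch below, which shells out to git cat-file -p as in the commands above. The helper names and the repository path argument are assumptions for illustration; the hash is validated first so that arbitrary text from a parsed log never ends up being interpreted as a command-line option.

```python
import re
import subprocess
from typing import List

# Abbreviated or full hexadecimal object names only.
HASH_RE = re.compile(r"[0-9a-f]{4,64}")


def cat_file_command(repo_path: str, object_name: str) -> List[str]:
    """Build the argv for printing an object's contents."""
    if HASH_RE.fullmatch(object_name) is None:
        raise ValueError(f"not an object name: {object_name!r}")
    return ["git", "-C", repo_path, "cat-file", "-p", object_name]


def read_blob(repo_path: str, object_name: str) -> bytes:
    """Return the raw contents of a blob (requires git and a real repository)."""
    result = subprocess.run(
        cat_file_command(repo_path, object_name),
        capture_output=True,
        check=True,  # raise CalledProcessError if git reports an error
    )
    return result.stdout


print(cat_file_command(".", "a874b73"))
```

Calling read_blob(".", "a874b73") inside the repository from the question would return the Rakefile as it was before the commit; 0000000 on an index line means there is no such version to fetch.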

OTHER TIPS

To answer the questions not handled by NHDaly's answer:

What is the int after the comma for a changed line? @@ -40,20 +40,20 @@?

It is the length of the block that follows. For the first pair, that is the number of lines without a + or - plus the number of lines with a -; for the second pair, it is the number of lines without a + or - plus the number of lines with a +. In this case, the block spans 20 lines in each version of the file.
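That counting rule is easy to check in code. The sketch below (the function name hunk_lengths is made up) counts the lines of a hunk body and compares the result with the Rakefile hunk from the question, whose header is @@ -5,5 +5,5 @@.

```python
from typing import Iterable, Tuple


def hunk_lengths(body: Iterable[str]) -> Tuple[int, int]:
    """Return (old_length, new_length) for the lines following a hunk header."""
    old_len = new_len = 0
    for line in body:
        if line.startswith("\\"):
            # "\ No newline at end of file" is a marker, not a file line.
            continue
        if line.startswith("-"):
            old_len += 1      # only in the original file
        elif line.startswith("+"):
            new_len += 1      # only in the new file
        else:
            old_len += 1      # context line: present in both files
            new_len += 1
    return old_len, new_len


rakefile_hunk = [
    " spec = Gem::Specification.new do |s|",
    '     s.name      =   "simplegit"',
    '-    s.version   =   "0.1.0"',
    '+    s.version   =   "0.1.1"',
    '     s.author    =   "Scott Chacon"',
    '     s.email     =   "schacon@gee-mail.com"',
]

print(hunk_lengths(rakefile_hunk))  # (5, 5)
```

Four context lines plus one removed line give 5 for the old side, and four context lines plus one added line give 5 for the new side.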

What is the part after the second @@ for a changed line?

diff uses an algorithm that determines the context. It works best for C code, where it should show the function header; see, for example, commit af87d2fe95 of the Linux kernel.

As an addition to the other (excellent) answers, here are some more in-depth examples of the added/deleted line counts, since the unified diff format treats an omitted comma value as 1 (one line added/deleted). In other words, a missing count does not mean null or zero but 1, whereas an explicit 0 really does mean zero lines. This might confuse some programmers like me.

Examples

The following examples assume that you're calling git show --unified=0 $sha, so unchanged lines are not shown. Otherwise, unchanged lines are included in the counts, as TimWolla pointed out in the comments (thanks).

@@ -10,5 +10,10 @@

Means that there were 5 deleted lines and 10 lines added.

@@ -10,0 +10,2 @@

No lines were deleted, but 2 lines were added.

@@ -10 +10,2 @@

One line was deleted and 2 lines added.

@@ -10 +10 @@

One line was deleted and 1 line was added.

@@ -10,7 +10 @@

7 lines were deleted and 1 line was added.

@@ -10,7 +10,0 @@

7 lines were deleted and no lines were added.

Licensed under: CC-BY-SA with attribution