Question

As I understand it, Git hashes files like this (Lua example, I have a function called sha1 that calculates, well, sha1 hashes):

sha1("blob "..filesize.."\0"..content)

My question is: how does Git combine these individual hashes into one? Specifically, I want to be able to calculate the hash of the latest commit on a Git repo (on GitHub) to verify that the local copy of the repo is identical to the one on GitHub, while still allowing people to modify the code.

Does Git concatenate the hashes and then hash again, or use some other trickery? From what I understand, the "latest commit" hash is just a hash of the repo's content, so I can see if my files match. Is this true?


Solution

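Blobs are just blobs (file contents). Trees refer to blobs, other trees and (for submodules) commits by embedding the hashes of those objects, so a tree's hash is a hash of hashes. A commit embeds the hash of its top-level tree along with the hashes of any parent commits, plus the author, committer, dates and message. So the commit hash is not just a hash of the repo's content: two commits with identical files can still have different hashes. If you only care whether the files match, compare the top-level tree hash rather than the commit hash.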
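To make the hash-of-hashes idea concrete, here is a minimal sketch in Python (rather than Lua, since Python ships SHA-1 in hashlib). The function names are made up for illustration, and the sketch assumes a SHA-1 repository and plain commits with no extra headers such as a GPG signature; treat it as an approximation of the object format, not as Git's own code.

```python
import hashlib

def git_object_hash(obj_type: str, body: bytes) -> str:
    # Every object is hashed as "<type> <size>\0" followed by its body.
    header = f"{obj_type} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def blob_hash(content: bytes) -> str:
    return git_object_hash("blob", content)

def tree_hash(entries) -> str:
    # entries: iterable of (mode, name, hex_hash), e.g. ("100644", "a.txt", "...").
    # Directories use mode "40000" and sort as if their name ended in "/".
    def sort_key(entry):
        mode, name, _ = entry
        return name.encode() + (b"/" if mode == "40000" else b"")
    body = b""
    for mode, name, hex_hash in sorted(entries, key=sort_key):
        # Each entry is "<mode> <name>\0" plus the 20 raw bytes of the child's hash.
        body += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(hex_hash)
    return git_object_hash("tree", body)

def commit_hash(tree_hex, parent_hexes, author, committer, message) -> str:
    # author/committer look like "Name <email> 1700000000 +0000".
    lines = [f"tree {tree_hex}"]
    lines += [f"parent {p}" for p in parent_hexes]
    lines.append(f"author {author}")
    lines.append(f"committer {committer}")
    body = "\n".join(lines) + "\n\n" + message
    return git_object_hash("commit", body.encode())

if __name__ == "__main__":
    blob = blob_hash(b"hello world\n")
    tree = tree_hash([("100644", "hello.txt", blob)])
    who = "A U Thor <author@example.com> 1700000000 +0000"  # made-up identity
    print(blob)
    print(tree)
    print(commit_hash(tree, [], who, who, "initial commit\n"))
```

Changing only the message or the timestamp changes the commit hash while the tree hash stays the same, which is why the tree hash is the better thing to compare when only the files matter.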

To see this for yourself, fetch zlib and build its zpipe example (run it with -d to decompress); that will let you dump the content of each repo object directly. git cat-file will do the same, but without the type/length/NUL header.
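If building zpipe is inconvenient, roughly the same dump can be done with Python's zlib module. This is only a sketch: read_loose_object is a hypothetical helper name, and it assumes the object is stored loose under .git/objects (objects that have been packed into a packfile cannot be read this way).

```python
import os
import zlib

def read_loose_object(git_dir: str, hex_hash: str):
    # Loose objects live at <git_dir>/objects/<first 2 hex chars>/<remaining chars>.
    path = os.path.join(git_dir, "objects", hex_hash[:2], hex_hash[2:])
    with open(path, "rb") as f:
        raw = zlib.decompress(f.read())
    # The inflated data is "<type> <size>\0" followed by the object body.
    header, _, body = raw.partition(b"\0")
    obj_type, size = header.decode().split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return obj_type, body
```

For example, read_loose_object(".git", some_hash) returns a pair such as ("commit", b"tree ...") when some_hash names a loose commit object; the body of a commit is readable text, while the body of a tree contains raw 20-byte hashes.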

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow