Git blobless repository

Question 1

Git now supports "partial clone", which downloads a repository's commits and trees without its blobs. To get one, pass --filter=blob:none to git clone. Git then fetches each missing blob from the remote on demand, the first time a command such as checkout needs its contents. The remote you clone from must run a git version new enough to support the filter protocol.
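As a rough sketch of what that looks like in practice, the script below builds a throwaway "remote" and makes a blobless clone of it over a file:// URL. The directory names (origin-repo, blobless) and the committer identity are made up for illustration. The two uploadpack settings are an assumption about what a self-hosted remote needs in order to serve filtered fetches; a hosting service would configure this on its own side.

```shell
#!/bin/sh
set -eu

# Scratch directory; every path below is illustrative.
work=$(mktemp -d)
cd "$work"

# Build a small "remote" with a single commit.
git init -q origin-repo
cd origin-repo
git config user.name "Example"
git config user.email "example@example.invalid"
echo "hello" > file
git add file
git commit -q -m "add file"
# Assumption: allow this repository to serve filtered (partial) fetches.
git config uploadpack.allowFilter true
git config uploadpack.allowAnySHA1InWant true
cd ..

# Blobless clone: commits and trees are downloaded, blobs are left out.
# --no-checkout keeps the initial checkout from fetching HEAD's blobs.
git clone -q --no-checkout --filter=blob:none "file://$work/origin-repo" blobless

# Objects reachable from HEAD that are not present locally are printed
# with a leading '?'. Count them.
missing=$(git -C blobless rev-list --objects --missing=print HEAD | grep -c '^?' || true)
echo "missing objects: $missing"
```

Without --no-checkout, the clone would immediately fetch the blobs needed to populate the working tree, so only the blobs of older history would stay missing.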


Question 2

Technically, a commit object names only a tree object, and that tree in turn names further trees and blobs. So a git repository in which every blob object file was deliberately "broken" (e.g., overwritten with an empty file, or removed entirely) would still work to some degree. You can see how far by breaking a single loose blob object by hand:

$ chmod +w .git/objects/f7/0d6b139823ab30278db23bb547c61e0d4444fb
$ : > .git/objects/f7/0d6b139823ab30278db23bb547c61e0d4444fb
$ git status
# On branch master
nothing to commit, working directory clean
$ git cat-file -p HEAD:file
error: object file .git/objects/f7/0d6b139823ab30278db23bb547c61e0d4444fb is empty
fatal: Not a valid object name HEAD:file
$ git fsck
Checking object directories: 100% (256/256), done.
error: object file .git/objects/f7/0d6b139823ab30278db23bb547c61e0d4444fb is empty
error: sha1 mismatch f70d6b139823ab30278db23bb547c61e0d4444fb
error: f70d6b139823ab30278db23bb547c61e0d4444fb: object corrupt or missing
missing blob f70d6b139823ab30278db23bb547c61e0d4444fb

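Clearly it sort-of-works: commands that only need commits and trees succeed, while anything that has to read the blob's contents fails. (git cat-file -p HEAD and git cat-file -p HEAD: also work here, as does git ls-tree -r HEAD.)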

The problem you will run into immediately is that git prefers to store objects in packs and to transfer packs between repositories, and the pack machinery will notice corrupted objects (or missing ones, if you rm them). Dropping blobs might not even save much space, depending on how well the objects compress in the packs: a repository is sometimes smaller than its own checked-out tree.
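To get a feel for the space question, you can compare the size of the packed objects with the size of the files they expand to. The sketch below does this for a throwaway repository; the names demo and big.txt and the deliberately repetitive file contents are made up for illustration, so the ratio it prints says nothing about any real project.

```shell
#!/bin/sh
set -eu

# Scratch repository; all names here are illustrative.
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Example"
git config user.email "example@example.invalid"

# A highly compressible file of about 1 MB (50000 identical lines).
yes "the same line again" | head -n 50000 > big.txt
git add big.txt
git commit -q -m "add big.txt"

# Move the loose objects into a pack.
git gc -q

# Size of the checked-out file versus size of the packed objects, in KiB.
tree_kib=$(( $(wc -c < big.txt) / 1024 ))
pack_kib=$(git count-objects -v | awk '/^size-pack:/ {print $2}')
echo "checked-out file: ${tree_kib} KiB, packed objects: ${pack_kib} KiB"
```

On a real repository, running git count-objects -v next to du -sh on the working tree gives the same comparison. Files that are already compressed (images, archives) will show far less difference than this example.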