Note: using an explicit "remote" is the way to go these days (see below for why). Naming a url directly is a very old (and pretty much obsolete) method.
If you were to run gitk --all FETCH_HEAD
you'd see something different (try it and see). The reason is that --all
only names all refs in refs/
(see below).
Remotes and refspecs
What's a "remote"?
A remote is, in concrete terms, an entry in the git config file (usually .git/config
within the repo itself). Or rather, a series of entries under a section, remote.name
:
[remote "origin"]
fetch = +refs/heads/*:refs/remotes/origin/*
url = ssh://some.host.name/path/to/repo.git
or similar. The point of this is to record some common items so that you don't have to repeat them all the time. In particular, the url (and optionally pushurl) does not have to be spelled out after this. The fetch =
line is also important, as noted below. (It's different for a "mirror" than for a "regular" repository.)
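To see where these entries come from, here is a minimal sketch, assuming a scratch repository under a temporary directory (the tmp path and variable names are my own placeholders; the URL is the one from the example above). Running git remote add should write both the url and the default fetch line, and git config reads them back:

```shell
# Scratch repository in a temporary directory (placeholder path).
tmp=$(mktemp -d)
git init -q "$tmp/repo"

# Adding a remote only writes config entries; it does not contact the host.
git -C "$tmp/repo" remote add origin ssh://some.host.name/path/to/repo.git

# Read back the two entries stored under [remote "origin"].
remote_url=$(git -C "$tmp/repo" config --get remote.origin.url)
remote_fetch=$(git -C "$tmp/repo" config --get remote.origin.fetch)
echo "url   = $remote_url"
echo "fetch = $remote_fetch"

rm -rf "$tmp"
```

The fetch value printed here is the refspec that a plain git fetch origin will use later.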
If you run git fetch
with a URL or path instead of a remote name ...
If you do git fetch ../second
as shown above, you're naming a repository directly, rather than a "remote". So you don't need a remote "origin"
section and all its entries, but instead, you may have to do more work / typing. You can name the other repository by a full url like ssh://...
or https://...
or whatever; for the special case of a repository already on your own machine, you can use a relative path name, as in your example.
I find it best to think about the repository argument as identifying a "remote repository", probably on some other machine over a network. This helps keep clear in my mind who has access to what. The special case of a "remote repository" being on your own local machine is, well, a special case. Obviously if it's on your local machine, it's accessible at all times. Other remotes are often less accessible.
Consider the case of cloning, e.g., the source code to git itself from some web site (kernel.org or wherever), onto a laptop. At some point you unplug the laptop and take it with you—maybe onto a plane, where you won't have network access. So "they" give you access to their repository and you copy it to yours. Once you have everything, you don't need "theirs" except to occasionally re-synchronize with them.
git fetch
can take more than one argument
If you run git fetch repository refspec
, fetch updates not only the objects in your local repository, but also some set of "ref-names" (references; see below). The last argument to fetch
is the "refspec" part, which (to ignore some technicalities) is basically a pair of ref-names, separated by a colon: the name on their side, then the name on your side. For instance, you might write git fetch ssh://... master:refs/remotes/origin/master
.
You need to specify which ref-names, if any, on the place you're fetching from, should have their objects brought over—but also, just as important, what name(s) those should be given in "your" repository. Sure, "they" have branch master
, but also branches maint
(maintenance), next
, and so on. Initially, you could give them the same branch names in your repository, but after you've been working and have added stuff, and you re-synchronize with them, their master
and your master
are different. So you need a different name under which to put "their branch" when you fetch their updates to their master
.
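A minimal sketch of fetching with an explicit refspec, assuming two scratch repositories on the local machine (the names theirs and mine, the demo identity, and the tmp path are all made up for illustration). The left side of the colon names their branch; the right side is the name it gets in your repository:

```shell
tmp=$(mktemp -d)

# "Their" repository, with one commit on a branch named master.
git init -q "$tmp/theirs"
git -C "$tmp/theirs" symbolic-ref HEAD refs/heads/master
git -C "$tmp/theirs" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "their commit"

# "Your" repository: fetch their master, but store it under a different name.
git init -q "$tmp/mine"
git -C "$tmp/mine" fetch -q "$tmp/theirs" master:refs/remotes/origin/master

theirs_master=$(git -C "$tmp/theirs" rev-parse refs/heads/master)
my_copy=$(git -C "$tmp/mine" rev-parse refs/remotes/origin/master)
echo "theirs: $theirs_master"
echo "mine:   $my_copy"

rm -rf "$tmp"
```

Both lines should show the same commit ID, while your own refs/heads/ space stays untouched.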
Running git fetch
with a remote name, like origin
, provides a refspec for you, via that fetch
line (in fact, there can be multiple fetch
lines, for multiple refspecs). But when you're not using a remote, you have to provide your own refspecs. You didn't, so you got a default (more about this in a moment).
References
References include things like branch and tag names. However, they're much more general and flexible than that. In fact, HEAD
is also a reference. References have a whole "name space" thing going on: they are almost all spelled starting with refs/
, and particular kinds of refs live in different parts of this space. The four you will use all the time are HEAD
(which is kind of special—it doesn't start with refs/
and git uses it internally all the time—but it is still a reference), branches (local branches), tags, and remote branches.
(In fact, the HEAD
reference name is so special that if you remove it, git decides that you no longer have a repository after all.)
Git will usually automatically choose the "right kind" of ref and not make you spell it all out, but it helps to know all this stuff, especially when git gets confused and its "figure it out and do what I mean" code does something you did not actually mean.
Local branches
Local branches live in refs/heads/
, so your local master
branch is actually the full name refs/heads/master
. When you create new branches, this just adds more refs/heads/
names. (Those wind up in files in your local repository. Creating a branch just needs to create a tiny 41-byte file. This is why branching is so fast and easy in git.)
Usually, you leave off the refs/heads/
part and just write your branch name. Git knows what to do.
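As a rough illustration, assuming a scratch repository (the tmp path, the demo identity, and the branch name topic are hypothetical), you can ask git to expand a short branch name to its full reference and list everything under refs/heads:

```shell
tmp=$(mktemp -d)
git init -q "$tmp/repo"
git -C "$tmp/repo" symbolic-ref HEAD refs/heads/master
git -C "$tmp/repo" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "first"

# Creating a branch just adds one more name under refs/heads/.
git -C "$tmp/repo" branch topic

# Expand the short name, then list all local branch references.
full_name=$(git -C "$tmp/repo" rev-parse --symbolic-full-name master)
branch_refs=$(git -C "$tmp/repo" for-each-ref --format='%(refname)' refs/heads)
echo "$full_name"
echo "$branch_refs"

rm -rf "$tmp"
```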
Tags
Tags live in refs/tags/
: the tag v1.0
is just refs/tags/v1.0
. Using --tags
with git fetch
just tells it to add refs/tags/*:refs/tags/*
to the refspecs it will update. (In some versions of git this is a "replace" instead of "add".)
Usually, you leave off the refs/tags/
part and just write the tag name. Since you're running a command like git tag
or git fetch --tags
, git knows what to do.
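A small sketch of the tag refspec written out by hand, assuming two scratch repositories (theirs, mine, the demo identity, and the tmp path are placeholders of mine). It copies every tag to the same name locally and then expands the short tag name:

```shell
tmp=$(mktemp -d)

# "Their" repository, with one commit tagged v1.0.
git init -q "$tmp/theirs"
git -C "$tmp/theirs" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "release"
git -C "$tmp/theirs" tag v1.0

# Fetch with the tag refspec spelled out explicitly.
git init -q "$tmp/mine"
git -C "$tmp/mine" fetch -q "$tmp/theirs" "refs/tags/*:refs/tags/*"

tag_full_name=$(git -C "$tmp/mine" rev-parse --symbolic-full-name v1.0)
echo "$tag_full_name"

rm -rf "$tmp"
```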
Remote branches
Despite the name, "remote branches" are actually a local thing, kept in "your" repo. In other words, they come with you when you take the laptop on the plane.
Remote branches live in refs/remotes/
, and then have one more name-part that is just the name of the remote. For the origin
remote, for instance, you get refs/remotes/origin/master
to keep track of what was in master
on remote origin
. If origin
also has a branch named maint
, you can keep track of "what was in maint
over there" in your own, local, refs/remotes/origin/maint
.
Again, usually you leave out the refs/remotes/
part—but this time, you keep the remote-name. So you write things like origin/master
and origin/maint
.
One big reason for the extra name-part is that you can have more than one remote. If you have remotes origin
and fred
, you keep your copy of master
-on-origin in origin/master
, and you keep your copy of master
-on-fred in fred/master
. The other big reason for the extra name-part is that when you write origin/master
, git can tell that you mean the remote branch master
, not your local master
.
These "remote branches" are what git fetch
needs to update. But, in order to update them automatically, it needs to know the name of the remote. That's why git fetch remote
is "better": it just does all this automatically. You could write them out explicitly, with git fetch url "+refs/heads/*:refs/remotes/origin/*"
, but it sure is nicer to have it all saved away under remote "origin"
.
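Here is a sketch of the remote-based workflow, assuming a local scratch repository stands in for the remote (theirs, mine, the demo identity, and the tmp path are hypothetical). With the remote saved, a bare git fetch origin picks up the stored refspec and updates the remote branch:

```shell
tmp=$(mktemp -d)

# Stand-in for the remote repository, with one commit on master.
git init -q "$tmp/theirs"
git -C "$tmp/theirs" symbolic-ref HEAD refs/heads/master
git -C "$tmp/theirs" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "their commit"

# Record the remote once, then fetch by name with no refspec.
git init -q "$tmp/mine"
git -C "$tmp/mine" remote add origin "$tmp/theirs"
git -C "$tmp/mine" fetch -q origin

# The short name origin/master expands to the remote-branch reference.
tracking_ref=$(git -C "$tmp/mine" rev-parse --symbolic-full-name origin/master)
tracking_id=$(git -C "$tmp/mine" rev-parse origin/master)
theirs_id=$(git -C "$tmp/theirs" rev-parse master)
echo "$tracking_ref -> $tracking_id"

rm -rf "$tmp"
```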
The obsolete way
Long ago, git did not have all this stuff. Instead, you ran git fetch url refname
, e.g., git fetch ssh://... master
.
To make this work, fetch
had to not clobber your master
. So what it did—and still does—is go to the remote repository and bring over all the repository-objects needed, drop them into your repository, and then write another "special" reference, FETCH_HEAD
. (Like HEAD
and MERGE_HEAD
and a few more special names, FETCH_HEAD
does not live under the refs/
space.)
This happens any time you write a refspec and leave out the colon. And, if you leave out the refspec entirely, that means the same as if you had written HEAD
. Thus:
git fetch url master means git fetch url master:FETCH_HEAD
git fetch url maint means git fetch url maint:FETCH_HEAD
git fetch url means git fetch url HEAD:FETCH_HEAD
Note that the remote repository is a git repository ("well duh" :-) ). This means it has a HEAD
. If it's a typical repository for fetch
ing, its HEAD
is the same as its master
, so that the default you get is to fetch master
and write that into FETCH_HEAD
.
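The old-style fetch can be sketched like this, assuming two scratch repositories (theirs, mine, the demo identity, and the tmp path are made up). Naming the repository by path and giving a refspec with no colon should leave the result only in FETCH_HEAD, with nothing created under refs/:

```shell
tmp=$(mktemp -d)

# "Their" repository, with one commit on master.
git init -q "$tmp/theirs"
git -C "$tmp/theirs" symbolic-ref HEAD refs/heads/master
git -C "$tmp/theirs" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "their commit"

# Fetch by path, refspec without a colon: only FETCH_HEAD is written.
git init -q "$tmp/mine"
git -C "$tmp/mine" fetch -q "$tmp/theirs" master

fetched_id=$(git -C "$tmp/mine" rev-parse FETCH_HEAD)
theirs_id=$(git -C "$tmp/theirs" rev-parse master)
refs_in_mine=$(git -C "$tmp/mine" for-each-ref)
echo "FETCH_HEAD: $fetched_id"
echo "refs under refs/: ${refs_in_mine:-none}"

rm -rf "$tmp"
```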
git pull
Git's pull
command is basically just a convenience method. It "means" the same thing as git fetch
followed by git merge
(or, with git pull --rebase
, git fetch
followed by git rebase
, but let's ignore that here).
It's a somewhat weird and (my opinion) broken convenience method, though. (Much is to be fixed in git 1.9.) When you run:
git pull origin master
for instance, what git pull
does is to invoke git fetch
"the old way", so that this brings over origin
's master
but fails to update refs/remotes/origin/master
. Instead, it just puts the stuff-brought-over reference into FETCH_HEAD
. There, it's invisible to most commands, including gitk --all
.
But the next thing git pull origin master
does is to run (in effect):
git merge FETCH_HEAD
This merges the changes into your current branch, which makes them visible to most commands, including gitk --all
.
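The two steps can be run by hand. This is only a sketch of what pull does "in effect", assuming scratch repositories named theirs and mine with a throwaway demo identity (all hypothetical), and using --ff-only so the merge step needs no commit of its own:

```shell
tmp=$(mktemp -d)

# "Their" repository with a first commit, then a clone of it.
git init -q "$tmp/theirs"
git -C "$tmp/theirs" symbolic-ref HEAD refs/heads/master
git -C "$tmp/theirs" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "first"
git clone -q "$tmp/theirs" "$tmp/mine"

# They move ahead by one commit after the clone.
git -C "$tmp/theirs" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "second"

# Step 1: fetch "the old way", leaving the result in FETCH_HEAD.
git -C "$tmp/mine" fetch -q "$tmp/theirs" master
# Step 2: merge FETCH_HEAD into the current branch.
git -C "$tmp/mine" merge -q --ff-only FETCH_HEAD

mine_tip=$(git -C "$tmp/mine" rev-parse HEAD)
theirs_tip=$(git -C "$tmp/theirs" rev-parse master)
echo "mine:   $mine_tip"
echo "theirs: $theirs_tip"

rm -rf "$tmp"
```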
In this particular case, it does not matter whether you run git pull remote branch
or git pull url branch
, or either of those without a branch
argument, as the pull
script prevents git fetch
from updating the remote-branch names.
(In git 1.9, a git pull
with a remote name, or a git pull
with no arguments that is able to compute the remote name, will run git fetch
such that it updates the remote-branch names.)