In the process of exploring the inner workings of Git, I went into the refs/remotes/origin directory of my Git project and ran the ls command. Here is what I saw.
$ ls
HEAD sp2013dev
Then I ran cat HEAD, and here's what was printed.
$ cat HEAD
ref: refs/remotes/origin/master
However, there is no file or directory by the name of master in the refs/remotes/origin directory. This directory only has 'HEAD' and 'sp2013dev'.
Am I missing something here? Why is HEAD referring to something (btw, is 'ref' the correct terminology for this 'something'?) which does not exist?
Just for emphasis: you're poking around in the insides of Git; you've found a directory named .git/refs/remotes/origin and it contains two files, HEAD and sp2013dev. The contents of .git/refs/remotes/origin/HEAD are ref: refs/remotes/origin/master.
A reference that consists of the literal string ref: followed by another reference name is a "symbolic" reference, in Git terms. It means "although this is a valid name, the SHA-1 to which this reference resolves is found by reading another reference." But, as you note, there's no file named master in the .git/refs/remotes/origin directory, so you are wondering how this can work.
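As a rough illustration of the "ref:" convention, here is a minimal Python sketch that classifies the text found in a reference file. The function name parse_ref_contents is made up for this example, and this is not how Git itself is implemented; it only mirrors the on-disk text format described above.

```python
def parse_ref_contents(text):
    """Classify the text of a reference file (hypothetical helper).

    Returns ("symbolic", target_name) for contents such as
    "ref: refs/remotes/origin/master", otherwise ("direct", object_id).
    """
    text = text.strip()
    if text.startswith("ref:"):
        # Everything after "ref:" is the name of another reference.
        return ("symbolic", text[len("ref:"):].strip())
    return ("direct", text)


print(parse_ref_contents("ref: refs/remotes/origin/master\n"))
# prints ('symbolic', 'refs/remotes/origin/master')
```

The "symbolic" result only gives you another name; that name still has to be looked up somewhere to get a hash.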
The answer is that not all references are necessarily in files. References can be, and are, "packed". Currently, packed references are found in .git/packed-refs, which is a plain-text file. Your own Git will have a refs/remotes/origin/master paired with an SHA-1 hash in your .git/packed-refs.
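To make the "plain-text file" point concrete, here is a small Python sketch that reads packed-refs style text into a dictionary. The function name and the sample contents are invented for illustration (the hashes are placeholders, not real objects), and the sketch assumes the usual layout of one "hash name" pair per line, with "#" header lines and "^" lines that follow annotated tags.

```python
def parse_packed_refs(text):
    """Map reference names to object IDs from packed-refs style text."""
    refs = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the "# pack-refs with: ..." header, and "^" lines,
        # which carry the peeled value of the tag on the previous line.
        if not line or line.startswith("#") or line.startswith("^"):
            continue
        object_id, _, name = line.partition(" ")
        refs[name] = object_id
    return refs


# Made-up sample; the hashes are placeholders.
sample = (
    "# pack-refs with: peeled fully-peeled sorted\n"
    "1111111111111111111111111111111111111111 refs/remotes/origin/master\n"
    "2222222222222222222222222222222222222222 refs/tags/v1.0\n"
    "^3333333333333333333333333333333333333333\n"
)
print(parse_packed_refs(sample)["refs/remotes/origin/master"])
# prints 1111111111111111111111111111111111111111
```

In your case, opening .git/packed-refs in a text editor should show a line of this shape for refs/remotes/origin/master.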
Note: a ref may appear in both the packed-refs file and a file in a .git/refs sub-directory. In this case, the file in .git/refs overrides the packed-refs entry. This allows git pack-refs (invoked by git gc) to pack all refs, and then let refs become "unpacked" as needed, when they're updated. (But this is an implementation detail, and you're not supposed to assume this; in scripts, use git update-ref and git symbolic-ref to read and update references, and let those programs enforce the updating rules.)
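Purely to illustrate the lookup order just described (and not as a substitute for git update-ref or git symbolic-ref), here is a Python sketch that resolves a name by checking the loose file first and falling back to packed-refs. The function name read_ref and the max_depth limit are assumptions of this example, and the demo builds a throwaway directory with placeholder hashes rather than touching a real repository.

```python
import tempfile
from pathlib import Path


def read_ref(git_dir, name, max_depth=5):
    """Resolve a reference name to an object ID, or None if it is missing.

    A loose file under git_dir wins over an entry in packed-refs, and
    "ref: ..." contents are followed to the reference they name.
    """
    git_dir = Path(git_dir)
    for _ in range(max_depth):
        loose = git_dir / name
        if loose.is_file():
            text = loose.read_text().strip()
            if text.startswith("ref:"):
                name = text[len("ref:"):].strip()  # symbolic: keep following
                continue
            return text
        packed = git_dir / "packed-refs"
        if packed.is_file():
            for line in packed.read_text().splitlines():
                object_id, _, packed_name = line.strip().partition(" ")
                if packed_name == name:
                    return object_id
        return None
    return None  # too many levels of symbolic references


# Demo in a temporary directory that stands in for .git.
with tempfile.TemporaryDirectory() as tmp:
    origin = Path(tmp) / "refs" / "remotes" / "origin"
    origin.mkdir(parents=True)
    (origin / "HEAD").write_text("ref: refs/remotes/origin/master\n")
    (Path(tmp) / "packed-refs").write_text("1" * 40 + " refs/remotes/origin/master\n")
    before = read_ref(tmp, "refs/remotes/origin/HEAD")  # found via packed-refs
    (origin / "master").write_text("2" * 40 + "\n")
    after = read_ref(tmp, "refs/remotes/origin/HEAD")  # loose file now wins

print(before)
print(after)
```

The first lookup mirrors your situation: HEAD is a file, master is not, and the hash comes from packed-refs.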
Currently, it appears that there is no packed format for symbolic references, so those all live in "real files" for now.
When you clone another repository (or, for that matter, any time you run git fetch or git push, although those do not deal with refs/remotes/origin/HEAD), there are two repositories involved, with two sets of rules being enforced by two different Gits controlling those two repositories. Aside from the command git remote set-head, it's only git clone that creates the origin/HEAD in the first place, so we can concentrate on git clone itself.
Since there are two repositories, let's give them names. We can call your local repository L and the repository at origin, which you're in the process of cloning, O. Since O is a Git repository, it has its own HEAD. This HEAD is normally a symbolic reference: HEAD -> master for instance. Whatever branch it names, though, is local to O: it's refs/heads/branch, which is a local branch on O. If you were to log in on the machine hosting repository O, and view their .git/HEAD file, it would contain ref: refs/heads/branch. The Git on O is enforcing this rule, that HEAD always names a local (local-to-O that is) branch.
Now, your own Git, running locally on your system, is working to create repository L. Your Git on L wants to create, on L, a remote-tracking branch whose full name is refs/remotes/origin/HEAD. This is going to be a symbolic ref, and it is going to point to refs/remotes/origin/branch. But there are two complications:
1. Your Git must find out from their Git (the one serving O) which branch is named by their HEAD. In our case it's (their) refs/heads/branch.
2. Your Git must create L's refs/remotes/origin/branch as a remote-tracking branch. (Your Git will store this in your .git/packed-refs file.) Your Git then creates L's refs/remotes/origin/HEAD as a symbolic reference to refs/remotes/origin/branch.
Something can go wrong at either of these two points. Step 2 fails if you use git clone -b otherbranch --single-branch, for instance. You are telling your Git that L should not have refs/remotes/origin/branch at all, but only refs/remotes/origin/otherbranch. But if step 2 fails like this, your Git simply does not create refs/remotes/origin/HEAD at all. That way you won't have, in L, a refs/remotes/origin/HEAD pointing to the non-existent remote-tracking branch.
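The step 2 behavior can be summed up in a few lines of Python. This is only a sketch of the rule as described here, with a made-up function name and arguments; it is not taken from Git's source.

```python
def remote_head_target(remote_head_branch, fetched_branches, remote="origin"):
    """Return the name refs/remotes/<remote>/HEAD should point to, or None.

    remote_head_branch: the branch named by the other repository's HEAD.
    fetched_branches: the branch names this clone actually copied.
    """
    if remote_head_branch not in fetched_branches:
        # No matching remote-tracking branch, so no origin/HEAD is created.
        return None
    return f"refs/remotes/{remote}/{remote_head_branch}"


# An ordinary clone copies every branch, including the one HEAD names.
print(remote_head_target("branch", {"branch", "otherbranch"}))
# prints refs/remotes/origin/branch

# With -b otherbranch --single-branch, only otherbranch is copied.
print(remote_head_target("branch", {"otherbranch"}))
# prints None
```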
Step 1 "fails" (in a way) if either your Git, building L, or their Git, serving O to your Git, is too old. There is a defined protocol for looking up reference names during these Internet-connection "phone calls" that copy commits (git fetch, git push) and view branch and tag names (git ls-remote) and so on. The clone command uses the same defined protocol. Early in the connection, your Git and their Git negotiate the protocol options to use. Older Gits have no option for expressing and resolving a symbolic reference, so if either Git does not support the option, O just claims that O's HEAD is some particular SHA-1 ID. The Git building the clone L has to guess which branch is actually HEAD on O. It does so by scanning through the rest of the branch names it gets for O. If O's HEAD is 1234567, and there's one branch that is also 1234567, well, that must be where their HEAD points!
This can be wrong, i.e., your Git may guess wrong. But if so, your Git just proceeds with the wrong assumption and goes on to worry about step 2.
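Here is a Python sketch of that guessing game, to show why it can go wrong. The function name and the branch table are invented for the example, and the sketch simply returns the first match; real Git's tie-breaking between several matching branches may differ.

```python
def guess_head_branch(head_id, branch_ids):
    """Guess which branch HEAD names when only HEAD's object ID is known.

    branch_ids maps branch names to object IDs. Returns the first
    matching name, or None if no branch has that ID.
    """
    for name, object_id in branch_ids.items():
        if object_id == head_id:
            return name
    return None


# Made-up data: two branches happen to point at the same commit.
branches = {"develop": "1234567", "master": "1234567", "topic": "89abcde"}

print(guess_head_branch("1234567", branches))
# prints develop (a wrong guess if O's HEAD actually names master)

print(guess_head_branch("fffffff", branches))
# prints None (no branch matches)
```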
Step 1 can fail quite differently if their (O's) HEAD is detached. In this case, there may be no matching branch. If so, your Git won't guess wrong, it will just know that there's no branch checked out in repository O. If both your Git and their Git are not too old, your Git will negotiate the right protocol option and your Git will know for sure whether O has a detached HEAD, and if not, what branch is checked out in O.
It's easy to make step 2 fail deliberately: just clone some repository whose HEAD you know refers to some particular branch (such as master) while using both --single-branch and -b to make your clone L avoid that branch. It's a bit harder to make step 1 fail deliberately, but you can do it by logging in on the system that exports O and detaching O's HEAD.
If you do this and then, back on your own system, use git clone to make L, you will simply find that your clone L has no origin/HEAD. This maintains the normal rules, which include these three (I'm not sure off-hand if there are more):
HEAD, which is normally (but not always) symbolic, contains only a reference to a local branch, i.e., a name in the refs/heads/ name-space.
The special case exception is that the special reference name HEAD may contain the name of a local branch that does not actually exist yet. This (called, variously, an "unborn branch" or an "orphan branch") is how master comes into being in a new, empty repository: HEAD points to master even though master does not yet exist. A commit you make when in this state causes the branch to begin existing, pointing to the commit just made; and the commit just made has no parents, i.e., is a root commit.
You might think you could break these rules using git remote set-head, but in fact, it won't let you:
Use <branch> to set the symbolic-ref refs/remotes/<name>/HEAD explicitly. e.g., "git remote set-head origin master" will set the symbolic-ref refs/remotes/origin/HEAD to refs/remotes/origin/master. This will only work if refs/remotes/origin/master already exists; if not it must be fetched first.
(emphasis mine). For more about git remote set-head, see the git remote documentation.
You can break these rules using either direct access to the .git directory, or with git symbolic-ref. Obviously, if you poke around inside .git, you can mess things up pretty good, :-) so that requires care. The sharp edges with git symbolic-ref are a bit more surprising, but now you know that you must be careful with that, too.