I know that git pull is actually a combination of git fetch and git merge, and that it basically brings in the repository as it is in the remote repository.
- Does it mean that after git pull my working tree will be identical to the remote repo?
- I found some cases where git pull doesn't change anything in my local repo or create any new commit. What is the explanation for this?
- Does it make sense that git pull makes changes at the index only?

The "exactly" part is really quite tough. It's often said, and it's mostly true, that git pull runs git fetch followed by either git merge or git rebase. In fact git pull, which used to be a shell script and is now a C program, quite literally ran git fetch first, though now it directly invokes the C code that implements git fetch.
The next step, however, is quite tricky. Also, in a comment, you added this:
[fetch] brings changes from the remote repo. Where does it put them?
To understand this properly, you must understand Git's object system.
git fetch
Each commit is a sort of standalone entity. Every commit has a unique hash ID: b06d364... or whatever. That hash ID is a cryptographic checksum of the contents of that commit. Consider, for instance:
$ git cat-file -p HEAD | sed 's/@/ /g'
tree a15b54eb544033f8c1ad04dd0a5278a59cc36cc9
parent 951ea7656ebb3f30e6c5e941e625a1318ac58298
author Junio C Hamano <gitster pobox.com> 1494339962 +0900
committer Junio C Hamano <gitster pobox.com> 1494339962 +0900
Git 2.13
Signed-off-by: Junio C Hamano <gitster pobox.com>
If you feed these contents (minus the sed 's/@/ /g' part, but with the header that Git adds to every object) to a SHA-1 checksum calculator, you will get the hash ID. This means that everyone who has this commit has the same hash ID for it.
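The claim that the hash ID is a checksum of the commit's contents can be checked in a throwaway repository. This is only a sketch: the temporary directory, file name, and "Demo" identity are placeholders invented for the example, and it relies on git hash-object to prepend the object header and compute the checksum rather than calling a SHA-1 tool directly.

```shell
#!/bin/sh
# Sketch: recompute a commit's hash ID from its raw contents.
# The repository, file name, and identity are throwaway placeholders.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
echo hello > file.txt
git add file.txt
git commit -q -m "first commit"

# The ID Git assigned to the commit.
stored_id=$(git rev-parse HEAD)

# Re-hash the raw commit contents. hash-object adds the
# "commit <size>" header before computing the checksum, and
# without -w it does not write anything to the object database.
recomputed_id=$(git cat-file commit HEAD | git hash-object -t commit --stdin)

echo "stored:     $stored_id"
echo "recomputed: $recomputed_id"
```

The two printed IDs should match, which is why any two repositories holding the same commit agree on its hash ID.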
You can get the Git repository for Git and run git cat-file -p v2.13.0^{commit} to see this same data. Note: the tag v2.13.0 translates to 074ffb61b4b507b3bde7dcf6006e5660a0430860, which is a tag object; the tag object itself refers to the commit b06d364...:
$ git cat-file -p v2.13.0
object b06d3643105c8758ed019125a4399cb7efdcce2c
type commit
tag v2.13.0
[snip]
To work with a commit, Git must store the commit object itself (the item with the hash ID b06d364...) somewhere, along with its tree object and any additional objects that tree needs. These are the objects that you see Git counting and compressing during a git fetch or git push.
The parent line tells which commit (or, for a merge, which commits, plural) precedes this particular commit. To have a complete set of commits, Git must also have the parent commit(s). (A shallow clone, such as one made with git clone --depth, can deliberately omit various parents, whose IDs are recorded in a special file of "shallow grafts", but a normal clone will always have everything.)
There are four types of object in total: commits, (annotated) tags, trees, and what Git calls blob objects. Blobs mostly store the actual files. All of these objects reside in Git's object database. Git can then retrieve them easily by hash ID: git cat-file -p <hash>, for instance, displays them in a vaguely human-readable format. (Most of the time there is little that must be done other than de-compressing, though tree objects have binary data that must be formatted first.)
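To see the four object types side by side, one can ask git cat-file -t for the type of each kind of object in a small test repository. The repository, the file name, and the tag name v1.0 here are made up for illustration.

```shell
#!/bin/sh
# Sketch: one object of each of the four types in a throwaway repository.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
echo hello > file.txt
git add file.txt
git commit -q -m "first commit"
git tag -a v1.0 -m "release 1.0"   # -a makes an annotated tag object

commit_type=$(git cat-file -t HEAD)            # the commit itself
tree_type=$(git cat-file -t 'HEAD^{tree}')     # its top-level tree
blob_type=$(git cat-file -t HEAD:file.txt)     # the file contents
tag_type=$(git cat-file -t v1.0)               # the annotated tag

echo "$commit_type $tree_type $blob_type $tag_type"

# Tree objects hold binary data; -p formats them for reading.
git cat-file -p 'HEAD^{tree}'
```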
When you run git fetch, or have git pull run it for you, your Git obtains the hash IDs of some initial objects from another Git, then uses the Git transfer protocols to figure out what additional objects are required to complete your Git repository. If you already have some object, you do not need to fetch it again, and if that object is a commit object, you do not need any of its parents either.[1] So you get only the commits (and trees and blobs) that you do not already have. Your Git then stuffs these into your repository's object database.
Once the objects are safely saved away, your Git records the hash IDs in the special FETCH_HEAD file. If your Git is at least 1.8.4, it will also update any corresponding remote-tracking branch names at this time: e.g., it may update your origin/master.
(If you run git fetch manually, your Git obeys all the normal refspec update rules, as described in the git fetch documentation. It's the additional arguments passed to git fetch by git pull that inhibit some of these, depending on your Git version.)
That, then, is the answer to what I think is your real first question: git fetch stores these objects in Git's object database, where they may be retrieved by their hash IDs. It adds the hash IDs to .git/FETCH_HEAD (always), and often also updates some of your references: tag names in refs/tags/, and remote-tracking branch names in refs/remotes/.
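The effect of git fetch can be observed with two local repositories standing in for the remote and your clone. This is a sketch under assumptions: the directory names upstream and local and the branch name main are invented for the example, and the behaviour shown is that of a reasonably modern Git.

```shell
#!/bin/sh
# Sketch: what git fetch changes, and what it leaves alone.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work" || exit 1

# "upstream" plays the role of the remote repository.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/main
echo one > upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -q -m "first"

git clone -q upstream local

# A new commit appears upstream after the clone.
echo two >> upstream/file.txt
git -C upstream commit -q -a -m "second"
new_id=$(git -C upstream rev-parse HEAD)

head_before=$(git -C local rev-parse HEAD)
git -C local fetch -q origin
head_after=$(git -C local rev-parse HEAD)

# The new commit is now in the local object database...
fetched_type=$(git -C local cat-file -t "$new_id")
# ...the remote-tracking name and FETCH_HEAD record its ID...
tracking_id=$(git -C local rev-parse refs/remotes/origin/main)
fetch_head_id=$(git -C local rev-parse FETCH_HEAD)
# ...but the current branch and the work-tree are untouched.
echo "HEAD before: $head_before"
echo "HEAD after:  $head_after"
cat local/file.txt
```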
[1] Except, that is, to "unshallow" a shallow clone.
git pull
Running git fetch gets you objects, but does nothing to incorporate those objects into any of your work. If you wish to use the fetched commits or other data, you need a second step.
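One possible second step is a fast-forward merge of the remote-tracking branch. The sketch below uses the same kind of made-up setup (directories upstream and local, branch main) and shows that the work-tree file only changes once the merge runs, not when the fetch does.

```shell
#!/bin/sh
# Sketch: fetch alone leaves the work-tree as it was; the merge updates it.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work" || exit 1

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/main
echo one > upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -q -m "first"
git clone -q upstream local

echo two >> upstream/file.txt
git -C upstream commit -q -a -m "second"
new_id=$(git -C upstream rev-parse HEAD)

# Step 1: obtain the objects.
git -C local fetch -q origin
content_after_fetch=$(cat local/file.txt)

# Step 2: incorporate them into the current branch.
git -C local merge -q --ff-only origin/main
content_after_merge=$(cat local/file.txt)
head_after_merge=$(git -C local rev-parse HEAD)

echo "after fetch: $content_after_fetch"
echo "after merge: $content_after_merge"
```

The --ff-only flag is a cautious choice for the example: it refuses to do anything if the local branch has commits of its own, so no merge commit is ever created here.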
The two main actions you can do here are git merge or git rebase. The best way to understand them is to read about them elsewhere (other SO postings, other documentation, and so on). Both are, however, complicated commands, and there is one special case for git pull that is not covered by those two: you can git pull into a non-existent branch. You have a non-existent branch (which Git also calls an orphan branch or an unborn branch) in two cases:
- in a new, empty repository, which has no commits yet
- after running git checkout --orphan newbranch
In both cases, there is no current commit so there is nothing to rebase or merge. However, the index and/or work-tree are not necessarily empty! They are initially empty in a new, empty repository, but by the time you run git pull you could have created files and copied them into the index.
This kind of git pull has traditionally been buggy, so be careful: versions of Git before 1.8-ish will sometimes destroy uncommitted work. I think it's best to avoid git pull entirely here: just run git fetch yourself, and then figure out what you want to do. As far as I know, it's OK in modern Git (these versions will not destroy your index and work-tree), but I am in the habit of avoiding git pull myself.
In any case, even if you are not on an orphan/unborn/non-existent branch, it's not a great idea to try to run git merge with a dirty index and/or work-tree ("uncommitted work"). The git rebase command now has an automatic-stash option (rebase.autoStash), so you can have Git automatically run git stash save to create some off-branch commits out of any such uncommitted work. Then the rebase itself can run, after which Git can automatically apply and drop the stash.
The git merge command did not have this automatic option for a long time (Git 2.27 and later accept git merge --autostash), but of course you can always do it manually.
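The automatic-stash behaviour of git rebase can be tried out as follows. This is a sketch only: the two repositories, the file names, and the branch name main are assumptions made for the example, and rebase.autoStash is set for a single command with -c rather than in any config file.

```shell
#!/bin/sh
# Sketch: rebase with uncommitted work, using rebase.autoStash.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work" || exit 1

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/main
echo one > upstream/file.txt
echo notes > upstream/notes.txt
git -C upstream add file.txt notes.txt
git -C upstream commit -q -m "first"
git clone -q upstream local

# Upstream moves on.
echo two >> upstream/file.txt
git -C upstream commit -q -a -m "second"
new_id=$(git -C upstream rev-parse HEAD)

# Uncommitted local edit to a tracked file: a dirty work-tree.
echo "local edit" >> local/notes.txt

git -C local fetch -q origin
# Git stashes the edit, rebases, then re-applies the stash.
git -C local -c rebase.autoStash=true rebase -q origin/main

head_id=$(git -C local rev-parse HEAD)
last_note=$(tail -n 1 local/notes.txt)
stash_count=$(git -C local stash list | wc -l | tr -d ' ')

echo "HEAD: $head_id"
echo "last line of notes.txt: $last_note"
echo "stash entries left: $stash_count"
```

The uncommitted edit and the upstream change touch different files here, so re-applying the stash cannot conflict; with overlapping edits the stash would have to be resolved by hand.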
Note that none of this works if you are in the middle of a conflicted merge. In this state, the index has extra entries: you cannot commit these until you resolve the conflicts, and you cannot even stash them (which follows naturally from the fact that git stash really makes commits). You can run git fetch at any time, since that just adds new objects to the object database; but you cannot merge or rebase when the index is in this state.
- But still, does it mean that after "git pull" my working tree will be identical to the remote repo?
Not necessarily. Any local commits you have on the branch you're pulling will be merged with the changes upstream. Use git pull --rebase to put your local changes on top of the upstream commits. You can get some pretty funky merge paths without --rebase.
- I found some cases that doing "git pull" doesn't change anything in my local repo or create any new commit?
If there are no new commits upstream, nothing will change in your local copy either.
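A quick way to convince yourself of this is to pull right after cloning, when upstream cannot have anything new. The repository names and the branch name main below are invented for the sketch.

```shell
#!/bin/sh
# Sketch: git pull with nothing new upstream is a no-op.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work" || exit 1

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/main
echo one > upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -q -m "first"
git clone -q upstream local

head_before=$(git -C local rev-parse HEAD)
git -C local pull -q
head_after=$(git -C local rev-parse HEAD)

# No new commit, and no changes in the index or work-tree.
pending=$(git -C local status --porcelain)

echo "HEAD before: $head_before"
echo "HEAD after:  $head_after"
```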
- Does it make sense that "git pull" makes changes at the index only?
Not that I know of. Perhaps if it fails to merge with your local commits, but then you should at least get some errors along the way.
- If it does, how can I make the changes at index move forward to the work tree?
git pull :) Or git rebase <upstream> <branchname>. This will rebase the local commits in your branch <branchname> on top of the upstream commits in that branch.
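Here is a sketch of that rebase with a diverged history, using origin/main as the upstream and main as the branch name. Both names, like the two repositories and their files, are assumptions made for the example.

```shell
#!/bin/sh
# Sketch: git rebase <upstream> <branchname> on a diverged branch.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work" || exit 1

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/main
echo one > upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -q -m "first"
git clone -q upstream local

# Histories diverge: one local commit, one upstream commit.
echo local > local/local.txt
git -C local add local.txt
git -C local commit -q -m "local work"
echo two >> upstream/file.txt
git -C upstream commit -q -a -m "second"

git -C local fetch -q origin
git -C local rebase -q origin/main main

# The local commit now sits directly on top of the upstream tip,
# and the history contains no merge commits.
parent_id=$(git -C local rev-parse 'HEAD^')
upstream_id=$(git -C local rev-parse origin/main)
merge_count=$(git -C local rev-list --merges --count HEAD)

git -C local log --oneline
```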