Git: How does git remember the index for each branch?

Tags:

git

git-index

For example, I create file a in the repo (suppose I'm on the master branch), then I git add a and git commit. After that I git branch copy and git checkout copy. Finally I create file b in the working directory, then git add b and git commit.

Git seems to be smart: when I check out the master branch again and run git ls-files, file b is not listed.

So I'm confused. Since we only have one index file in a repo, how can git maintain a different staging area for each branch at the same time?

EDIT:

How can it be explained that files which are staged but not committed are still remembered per branch?

Determinant asked Aug 23 '12 08:08


2 Answers

I've not dived into the implementation in detail, but when you switch branches, git automatically rewrites the index file to reflect the content of the new HEAD.

For example, I have two branches here: master (with one file) and test (with two files).

noufal@sanitarium% git branch
  master
* test
noufal@sanitarium% file .git/index
.git/index: Git index, version 2, 2 entries
noufal@sanitarium% git checkout master
Switched to branch 'master'
noufal@sanitarium% file .git/index
.git/index: Git index, version 2, 1 entries

The index changed when the branch was switched.

Also, if you switch branches "manually" by editing .git/HEAD, git doesn't update the index and gets confused. Continuing from above:
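The same effect can be reproduced without the file utility by counting index entries with git ls-files. This is a minimal sketch in a throwaway repository; the identity settings are placeholders, and the branch and file names are only chosen to mirror the session above.

```shell
set -e
# Throwaway repository in a temporary directory.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the initial branch name, whatever git's default is
git config user.name "Example"            # placeholder identity, needed for commits
git config user.email "example@example.com"

# master: one file
echo one > x
git add x
git commit -q -m "add x"

# test: two files
git checkout -q -b test
echo two > b
git add b
git commit -q -m "add b"

on_test=$(git ls-files | wc -l | tr -d ' ')     # index entries while on test
git checkout -q master
on_master=$(git ls-files | wc -l | tr -d ' ')   # index entries after switching

echo "test: $on_test entries, master: $on_master entries"
```

The count drops from 2 to 1 because the checkout rewrote .git/index to match the commit that master points to.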

noufal@sanitarium% more .git/HEAD
ref: refs/heads/master
noufal@sanitarium% echo "ref: refs/heads/test" > .git/HEAD
noufal@sanitarium% file .git/index
.git/index: Git index, version 2, 1 entries
noufal@sanitarium% git status
# On branch test
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#       deleted:    b
#

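In other words, the index lacks a file that exists in the commit HEAD now points to, so git reports it as "staged for deletion".

As for switching branches after staging: the index is a single area shared by all branches, so staged but uncommitted changes are not stored per branch. They are carried along when you switch, as long as they don't conflict with the branch being checked out.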

noufal@sanitarium% git branch
* master
  test
noufal@sanitarium% ls
x
noufal@sanitarium% git status
# On branch master 
nothing to commit (working directory clean)
noufal@sanitarium% git checkout test
Switched to branch 'test'
noufal@sanitarium% ls
x
noufal@sanitarium% echo "Something" > b
noufal@sanitarium% git add b
noufal@sanitarium% git status
# On branch test   
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#       new file:   b
#
noufal@sanitarium% git checkout master
A       b
Switched to branch 'master'
noufal@sanitarium% git status                    # Also there in index on master branch.
# On branch master 
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#       new file:   b
#
noufal@sanitarium% git commit -m "Added b in master"
[master 41d0c68] Added b in master
 1 file changed, 1 insertion(+)
 create mode 100644 b
noufal@sanitarium% git status
# On branch master 
nothing to commit (working directory clean)
noufal@sanitarium% git checkout test
Switched to branch 'test'
noufal@sanitarium% ls                           # Missing in the test branch although it was `git add`ed here. 
x
noufal@sanitarium%        
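A condensed version of the session above, as a sketch that can be run in a scratch directory. The identity settings are placeholders; git diff --cached --name-only is used here only as a compact way to list what is staged.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the initial branch name
git config user.name "Example"            # placeholder identity
git config user.email "example@example.com"

echo one > x
git add x
git commit -q -m "add x"
git branch test

git checkout -q test
echo Something > b
git add b                                          # staged on test, never committed
staged_on_test=$(git diff --cached --name-only)

git checkout -q master                             # b conflicts with nothing, so the switch succeeds
staged_on_master=$(git diff --cached --name-only)

echo "staged on test:   $staged_on_test"
echo "staged on master: $staged_on_master"
```

Both lines list b: the staged entry lives in the one shared index, not in either branch.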
Noufal Ibrahim answered Sep 22 '22 01:09


In order to understand this, you need to dig a little deeper into git internals.

Git stores all kinds of information as objects. There are mainly three kinds of objects.

  • blob

    Stores the actual file content.

  • tree

    Stores a directory structure; it may contain references to blob objects and to other tree objects.

  • commit

    Stores the information about a commit; it contains a reference to a tree object plus other information, e.g. author, committer, commit message and so on.

The index file is not itself a stored object, but it plays the role of a tree: it lists every tracked path together with the sha1 of its blob, and so describes the tree that the next commit will record.

Each object is identified by the sha1 hash of its content. Under .git/refs or in .git/packed-refs, git holds the relationship between a branch and the sha1 hash of the commit object it points to.
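The object kinds and the branch-to-commit mapping can be inspected with git rev-parse and git cat-file. The following is a rough sketch in a scratch repository; the file name a and the identity settings are illustrative only.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the initial branch name
git config user.name "Example"            # placeholder identity
git config user.email "example@example.com"

echo hello > a
git add a
git commit -q -m "add a"

commit=$(git rev-parse refs/heads/master)   # hash the branch ref points to
tree=$(git rev-parse "$commit^{tree}")      # tree referenced by that commit
blob=$(git rev-parse "$commit:a")           # blob holding the content of a

commit_type=$(git cat-file -t "$commit")
tree_type=$(git cat-file -t "$tree")
blob_type=$(git cat-file -t "$blob")
echo "$commit_type $tree_type $blob_type"

git cat-file -p "$tree"                     # lists mode, type, hash and name of each entry
```

git cat-file -t reports the type of an object, and git cat-file -p pretty-prints it, which makes it easy to follow the chain from branch to commit to tree to blob.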

Each time you check out a branch, git extracts the files according to the tree object associated with that branch's commit and generates a new index file from it.
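One way to check this relationship is git write-tree, a plumbing command that builds a tree object from the current index. In the sketch below (scratch repository, placeholder identity, illustrative file names), the tree built from the index right after a checkout is compared with the tree of the checked-out commit.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the initial branch name
git config user.name "Example"            # placeholder identity
git config user.email "example@example.com"

echo one > x
git add x
git commit -q -m "add x"
git checkout -q -b test
echo two > b
git add b
git commit -q -m "add b"

# Right after a checkout, the tree built from the index equals the commit's tree.
git checkout -q master
index_tree=$(git write-tree)
head_tree=$(git rev-parse 'HEAD^{tree}')

# Staging a new file changes the index, so the two no longer match.
echo three > c
git add c
dirty_index_tree=$(git write-tree)

echo "index after checkout: $index_tree"
echo "HEAD tree:            $head_tree"
echo "index after git add:  $dirty_index_tree"
```

The first two hashes are identical and the third differs, which fits the description above: checkout regenerates the index from the commit's tree, and git add then moves the index ahead of it.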

Git Internals could help.

weynhamz answered Sep 20 '22 01:09