Git commit commits staged and unstaged file

Tags:

git

TL;DR: When one file has both staged and unstaged changes, committing it records the most recent contents of the file, unstaged changes included. Why? I thought a commit would only include the staged version, as noted here: https://stackoverflow.com/a/19892657/279516.

Let's say we have one file in our working directory. (It has previously been committed to the repo.)

$ ls
foo.txt

The file's contents are currently just one character (the number 1):

$ cat foo.txt
1

Let's change the contents of the file to be "12". Now we have this:

$ cat foo.txt
12

Our working directory shows the change (instructional git output removed for brevity):

$ git status
modified:   foo.txt

Now a git add will add that file to the staging index.

$ git add foo.txt

You can't see it here, but the file name is now green, to indicate it has been staged:

$ git status
modified:   foo.txt

At this point, we could commit the file, and it would become part of our local repository. However, let's change foo.txt first and see what happens.

$ cat foo.txt
123

And if we check git status we see two versions of foo.txt:

$ git status
On branch master
Changes to be committed:

        modified:   foo.txt

Changes not staged for commit:

        modified:   foo.txt

The first foo.txt is green. The second is red. The first one has as its contents "12". The second one has "123". What happens if we commit now? Only the staged foo.txt should be committed. So the foo.txt with "12" as its body will be in our local repo. The other version of foo.txt is still just in our working directory.

Commit foo.txt to our local repo, AS IT WAS WHEN WE ADDED IT:

$ git commit -m "Added 2 to foo." foo.txt

However, that's not what happened. Our working directory has no changes now. Both versions were committed. Why?

$ git status
On branch master
nothing to commit, working tree clean

asked Dec 04 '18 15:12 by Bob Horn


2 Answers

Besides the existing (correct) answers, it's worth noting that when using git commit [flags] file1 file2 ... fileN, you can put in the flags --only or --include:

  • --only is the default, and it means disregard what I staged so far, just make a commit with these files added.

  • --include means to what I staged so far, add these files too.

That's simple enough, but subtly wrong, because --only has to take an after-commit action as well.

A proper understanding of this requires knowing what Git's index is, and that git commit actually commits what's in the index, not what's in the work-tree.

The way the Git programmers envision you using Git

The index is a fairly complicated thing, but most of it boils down to this: The index holds the set of files that will go into your next commit. That is, if you run git commit right now—without listing any files—the new commit will be a snapshot of all the files that are in the index right now, saving the contents that are in the index right now.

What this means is that right after:

$ git clone <some-url>
$ cd <repository>
$ git checkout <branch>

you have, in your index, all the same files, with the same content, that you see in your work-tree.

That is, each commit is a complete snapshot of all the files in that commit, frozen forever, in a special, compressed, Git-only form. This means you can always get back all of those files in their original form, by having Git extract and de-compress them, and that's what git checkout does: it finds the latest commit on the branch, and extracts and decompresses the files. (This is oversimplified: git checkout is really quite fancy. But that's the most basic thing it does.)

The useful-form files are in your work-tree, where you can see them and work on them. That's good, because the committed ones are frozen and Git-only, which would obviously be a problem if you had to work on them directly. :-)

But, to make a new commit, rather than re-compressing all the work-tree files—which would take a long time in a lot of cases—what Git does is save the (unfrozen but still compressed and Git-only) files in this thing called, variously, the index, the staging area, or the cache. So if you have README.txt and main.py in your work-tree, you also have them—in Git-only form—in your current commit (where they're frozen) and in your index (where they're thawed, but still Git-only).
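
The "three copies" idea can be checked directly. The following is a minimal sketch, assuming a throwaway pair of repositories under a mktemp directory (the demo identity, directory names, and file contents are placeholders, not part of the original answer). It clones a small repository and compares the blob ID of each file in the current commit, in the index, and as computed from the work-tree file:

```shell
#!/bin/sh
# Sketch: after a fresh clone, HEAD, the index, and the work-tree hold the same content.
set -eu
# Placeholder identity so the commit works without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
git init -q "$tmp/upstream"
cd "$tmp/upstream"
echo "readme" > README.txt
echo "print('hi')" > main.py
git add README.txt main.py
git commit -q -m "initial"

git clone -q "$tmp/upstream" "$tmp/clone"
cd "$tmp/clone"

mismatches=0
for f in README.txt main.py; do
    head_id=$(git rev-parse "HEAD:$f")   # frozen copy in the current commit
    index_id=$(git rev-parse ":$f")      # Git-only copy in the index
    tree_id=$(git hash-object "$f")      # ID the ordinary work-tree file would get
    echo "$f HEAD=$head_id index=$index_id work-tree=$tree_id"
    [ "$head_id" = "$index_id" ] && [ "$index_id" = "$tree_id" ] || mismatches=$((mismatches + 1))
done
echo "mismatches: $mismatches"
```

The `:path` form names the index copy of a file and `HEAD:path` names the committed copy, so the same two spellings work with git show to print the contents instead of the IDs.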

Running git add README.txt tells Git to overwrite the index copy with the work-tree copy: take the ordinary format file and re-compress it to the Git-only format, replacing the README.txt in the index with the one from the work-tree. It's not frozen yet—you can overwrite it again before you commit—but it's ready now.
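
A small sketch of that overwrite behavior, assuming a scratch repository (the "version N" contents and the demo identity are made up for illustration). It reads the index copy with git show :README.txt before and after each git add:

```shell
#!/bin/sh
# Sketch: git add replaces the index copy; the committed copy stays frozen.
set -eu
# Placeholder identity so the commit works without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "version 1" > README.txt
git add README.txt
git commit -q -m "initial"

echo "version 2" > README.txt        # edit the work-tree copy only
before=$(git show :README.txt)       # index still holds the old content
git add README.txt                   # overwrite the index copy
after=$(git show :README.txt)

echo "version 3" > README.txt        # not frozen yet: edit and add again
git add README.txt
again=$(git show :README.txt)
head=$(git show HEAD:README.txt)     # the existing commit never changes

echo "index before add: $before"
echo "index after add:  $after"
echo "index after 2nd:  $again"
echo "HEAD:             $head"
```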

Running git commit, without specifying any files, tells Git: Package up whatever is in the index right now and make a new commit from that. This goes very fast since the files are already in the right form: Git just needs to freeze them into a new commit. (Of course, until you make the files in the index different from the ones in the current commit, there's no point.)

Note that after git commit makes a new commit from the index, you're usually back to the normal situation where the index and the current commit match. If you git add-ed all the files you changed, all three copies of each file—HEAD, index, and work-tree—match.
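
To see this in action, here is a hedged sketch in a scratch repository (file contents and the demo identity are placeholders). The first plain git commit picks up only what was added to the index. The second one, made after adding the remaining file, leaves all three copies in agreement:

```shell
#!/bin/sh
# Sketch: a plain git commit snapshots the index, not the work-tree.
set -eu
# Placeholder identity so the commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo 1 > README.txt
echo 1 > main.py
git add README.txt main.py
git commit -q -m "initial"

echo 2 > README.txt
echo 2 > main.py
git add main.py                       # only main.py goes into the index
git commit -q -m "update main.py"     # no files listed: commit the index as-is

head_readme=$(git show HEAD:README.txt)
head_main=$(git show HEAD:main.py)
# After the commit, the index and HEAD match even though the work-tree differs.
if git diff --cached --quiet; then index_matches_head=yes; else index_matches_head=no; fi
status_after_first=$(git status --porcelain)

git add README.txt
git commit -q -m "update README.txt"
status_after_second=$(git status --porcelain)   # empty: all three copies match

echo "HEAD README.txt=$head_readme main.py=$head_main index_matches_head=$index_matches_head"
echo "status after first commit: '$status_after_first'"
echo "status after second commit: '$status_after_second'"
```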

Introducing --only and --include

Running git commit with some files listed is a bit different, and this is where --only and --include come in. If you use --include, Git essentially does git add on the listed files. It's just shorthand for git add <those files> && git commit, more or less.
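
A minimal sketch of the --include case, assuming a scratch repository with the two example files (contents and the demo identity are placeholders). main.py is staged by hand, README.txt is only named on the command line, and both end up in the new commit:

```shell
#!/bin/sh
# Sketch: git commit --include adds the listed files to what is already staged.
set -eu
# Placeholder identity so the commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo 1 > README.txt
echo 1 > main.py
git add README.txt main.py
git commit -q -m "initial"

echo 2 > README.txt
echo 2 > main.py
git add main.py                                   # staged by hand
git commit -q --include -m "update both" README.txt

head_readme=$(git show HEAD:README.txt)
head_main=$(git show HEAD:main.py)
status=$(git status --porcelain)                  # nothing left staged or modified

echo "HEAD README.txt=$head_readme main.py=$head_main status='$status'"
```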

But, if you use --only—or don't specify, which means --only—what Git does is to shove the regular index out of the way and instead, make a new temporary index from whatever is in the frozen commit. To this new temporary index, Git will git add each of the files you listed. Then Git will make a new commit from this other index. But now there's a problem, because Git now needs to go back to the normal index, and here's where it gets a bit tricky.

Having made a new commit from the temporary index, Git now needs to update the real index the same way. In essence, after committing from the temporary index, Git re-adds the same set of files you listed to the real index, so that they match again.

Let's use the two-file example, with README.txt and main.py, again. This time, let's add a version number after each file. It's not part of the name of the file, it's just to help us remember:

    HEAD           index          work-tree
-------------   -------------   -------------
README.txt(1)   README.txt(1)   README.txt(1)
main.py(1)      main.py(1)      main.py(1)

They start out with all three versions of each file being the same. (Note that you can't change the HEAD copy. You can only make a new commit, which then becomes the HEAD copy, because HEAD names the new commit.)

Now you edit both files in the work-tree:

    HEAD           index          work-tree
-------------   -------------   -------------
README.txt(1)   README.txt(1)   README.txt(2)
main.py(1)      main.py(1)      main.py(2)

Let's say you now do git add main.py to copy the work-tree version into the index:

    HEAD           index          work-tree
-------------   -------------   -------------
README.txt(1)   README.txt(1)   README.txt(2)
main.py(1)      main.py(2)      main.py(2)

If you run a plain git commit right now, the new HEAD would have the old README.txt, because the index has the old README.txt. But instead, let's run git commit --only README.txt. This makes a temporary index, so that we have:

    HEAD         temp-index       work-tree
-------------   -------------   -------------
README.txt(1)   README.txt(2)   README.txt(2)
main.py(1)      main.py(1)      main.py(2)

Next, this makes a new commit from the temporary index:

    HEAD         temp-index       work-tree
-------------   -------------   -------------
README.txt(2)   README.txt(2)   README.txt(2)
main.py(1)      main.py(1)      main.py(2)

Meanwhile, the real index is not yet changed. Scroll up a bit and look at it: which version of main.py is in it? Which version of README.txt is in it?

If Git merely switched back to the real index now, while keeping the commit you just made, this is what you'd have:

    HEAD         ugly-index       work-tree
-------------   -------------   -------------
README.txt(2)   README.txt(1)   README.txt(2)
main.py(1)      main.py(2)      main.py(2)

That is, your work-tree is all the latest files. Your commit has the updated README.txt. But this ugly state means that the next commit will use the old / wrong version of README.txt! So this is why Git now re-adds README.txt to the real index, so that you get:

    HEAD            index         work-tree
-------------   -------------   -------------
README.txt(2)   README.txt(2)   README.txt(2)
main.py(1)      main.py(2)      main.py(2)

Now you're ready to make a second commit with the updated main.py if you want.
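
The walkthrough above can be reproduced with a short script. This is a sketch under stated assumptions: a scratch repository, file contents standing in for the version numbers, a placeholder identity, and a hypothetical show_state helper that prints one row per file. The temporary index exists only while git commit runs, so the script can only show the state before and after it:

```shell
#!/bin/sh
# Sketch: git commit --only README.txt while main.py is staged.
set -eu
# Placeholder identity so the commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: print each file's content in HEAD, the index, and the work-tree.
show_state() {
    echo "$1"
    for f in README.txt main.py; do
        printf '  %-10s HEAD=%s  index=%s  work-tree=%s\n' \
            "$f" "$(git show "HEAD:$f")" "$(git show ":$f")" "$(cat "$f")"
    done
}

repo=$(mktemp -d)
cd "$repo"
git init -q
echo 1 > README.txt
echo 1 > main.py
git add README.txt main.py
git commit -q -m "initial"
show_state "start:"

echo 2 > README.txt
echo 2 > main.py
git add main.py
show_state "after editing both and git add main.py:"

git commit -q --only -m "update README.txt" README.txt
show_state "after git commit --only README.txt:"

staged=$(git diff --cached --name-only)   # what a plain git commit would now record
echo "still staged: $staged"
```

The last table printed should match the final one above: README.txt is at version 2 everywhere, and main.py is at version 2 in the index and work-tree but still at version 1 in HEAD.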

answered Sep 23 '22 02:09 by torek


If you want to commit just the staged version, run git commit without specifying any files.

Example:

$ echo 2 > foo
$ git add foo
$ echo 3 > foo
$ git commit -m haha

Now the staged version is committed and the unstaged changes remain in your working directory. This can be easily verified:

$ git show HEAD:foo
2
$ git diff
--- a/foo
+++ b/foo
@@ -1 +1 @@
-2
+3

Let me demonstrate this behavior (git commit with a file argument) with another example:

Perform these actions:

$ git init
$ echo 1 > foo
$ echo 1 > bar
$ git add foo bar
$ git commit -m 1

Now both foo and bar are committed.

$ echo 2 > foo
$ echo 2 > bar

Now both are changed. Let's stage foo and commit bar:

$ git add foo
$ git commit -m 2 bar
$ git status
Changes to be committed:
    modified: foo
$ git diff --name-only HEAD~ HEAD
bar

You see that foo is not changed in the commit, but the staged state is preserved.
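
Applied to the question's foo.txt scenario, the difference between the two forms can be sketched as follows. This assumes two scratch repositories built by a hypothetical make_repo helper, with a placeholder identity. Each one has "12" staged and "123" in the work-tree, and the only difference is whether foo.txt is named on the git commit command line:

```shell
#!/bin/sh
# Sketch: the same staged/unstaged file, committed with and without a file argument.
set -eu
# Placeholder identity so the commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: fresh repo with foo.txt committed as "1",
# staged as "12", and edited in the work-tree to "123".
make_repo() {
    repo=$(mktemp -d)
    cd "$repo"
    git init -q
    echo 1 > foo.txt
    git add foo.txt
    git commit -q -m "Add foo."
    echo 12 > foo.txt
    git add foo.txt
    echo 123 > foo.txt
}

make_repo
git commit -q -m "Added 2 to foo." foo.txt    # file named: work-tree copy is committed
with_path=$(git show HEAD:foo.txt)
with_path_status=$(git status --porcelain)

make_repo
git commit -q -m "Added 2 to foo."            # no file named: index copy is committed
without_path=$(git show HEAD:foo.txt)
without_path_status=$(git status --porcelain)

echo "git commit foo.txt -> HEAD has $with_path, status '$with_path_status'"
echo "git commit         -> HEAD has $without_path, status '$without_path_status'"
```

In the first case the work-tree ends up clean, which is what the question observed. In the second, the "123" edit remains as an unstaged modification.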

answered Sep 21 '22 02:09 by iBug