
Why does git say 'Changes not staged for commit' and indicate the submodule folder?

I have a git submodule inside another module, added via git submodule add <...> (command issued from parent repo), so the .gitmodules file is automatically generated inside the parent repo.

Suppose I make a change to the submodule (edit: and do not commit those changes) and then navigate back out to the parent and do git add -A and then git status, it says "Changes not staged for commit: submodule dir name ... etc".

I thought git would read the .gitmodules file (which the parent git generated!), realise it's a git submodule directory, and therefore not mention its unstaged status when I ask the parent for its status?

asked Jan 28 '19 04:01 by sphere



2 Answers

What's going on here is that your submodule repository is on a different commit than the hash ID recorded in the superproject. Your git status, run in the superproject, is telling you this, without changing it, and your git add -A apparently did not change it either.

This last part seems wrong. When I do something similar, and then use git add -A, I get:

Changes to be committed:
(use "git reset HEAD <file>..." to unstage)

        modified:   [submodule path]

If I then run two more commands, it goes back, as I expect:

$ git reset
Unstaged changes after reset:
M       [submodule path]
$ git submodule update
Submodule path [path]: checked out '[hash]'
$ git status
On branch ...
nothing to commit, working tree clean

(I suspect that you've made some change(s) in the submodule but never committed them there.)
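
That suspected situation can be reproduced with a minimal sketch. It assumes throwaway repositories named lib and super in a temporary directory (all names here are hypothetical), and it passes protocol.file.allow=always only because recent Git versions refuse local-path submodule clones by default:

```shell
set -eu
work=$(mktemp -d)   # scratch directory; every name below is hypothetical
g() { git -c user.name=demo -c user.email=demo@example.com \
          -c protocol.file.allow=always "$@"; }

# A tiny repository to use as the submodule's upstream.
g init -q "$work/lib"
echo one > "$work/lib/file.txt"
g -C "$work/lib" add file.txt
g -C "$work/lib" commit -q -m "lib: first commit"

# The superproject, with lib added as a submodule.
g init -q "$work/super"
cd "$work/super"
g submodule --quiet add "$work/lib" lib
g commit -q -m "add lib as a submodule"

# Edit a tracked file inside the submodule, but do not commit there.
echo two >> lib/file.txt

g add -A      # cannot stage uncommitted content that lives inside lib
g status      # lib should still appear under "Changes not staged for commit"
```

The superproject can only record a commit hash for lib, so an edit that has not been committed inside lib gives git add -A nothing to stage.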

What's going on, in fine-grained detail that will let you diagnose the problem

We have one Git repository, called the superproject, that is controlling a second repository, called the submodule. The superproject actually has three separate control knobs, one of which is present in each commit, and is therefore also found in the index (since the index controls what will go into the next commit).

One of these control knobs is the file you mentioned, .gitmodules. It tells the superproject how to clone the submodule if the submodule is not yet git cloned. Once the submodule is cloned, its main job is done.

The second is your .git/config file. It contains information copied out of the .gitmodules file, which you can update if needed, if the .gitmodules file is not quite right for your own purposes (which might differ from those of whoever's in charge of the .gitmodules file). Any settings in your .git/config override those in .gitmodules. Otherwise these two places to put settings are essentially equivalent.
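
To see these two places side by side, here is a small sketch under the same kind of assumptions: scratch repositories named lib and super, and a made-up lib-mirror URL used only to illustrate a local override.

```shell
set -eu
work=$(mktemp -d)   # scratch directory; every name below is hypothetical
g() { git -c user.name=demo -c user.email=demo@example.com \
          -c protocol.file.allow=always "$@"; }

g init -q "$work/lib"
echo one > "$work/lib/file.txt"
g -C "$work/lib" add file.txt
g -C "$work/lib" commit -q -m "lib: first commit"

g init -q "$work/super"
cd "$work/super"
g submodule --quiet add "$work/lib" lib
g commit -q -m "add lib as a submodule"

# The URL as shipped in commits, and the copy in this clone's .git/config.
g config -f .gitmodules --get submodule.lib.url
g config --get submodule.lib.url

# Point only this clone at a different (made-up) URL.
g config submodule.lib.url "$work/lib-mirror"
g config -f .gitmodules --get submodule.lib.url   # the tracked file is untouched
g config --get submodule.lib.url                  # the local override
```

Only the .gitmodules value travels with commits; the .git/config value stays private to this clone.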

The last is the one causing the issue. For a submodule to become checked out into your work-tree, and hence to be useful to you, the Git that's in control of the superproject spins up a second set of Git commands. In general, you might run:

git submodule update --init

to get the submodule checked out (though if you use git clone --recursive, Git does this for you).

At this point, the superproject Git has made an almost-empty directory with the correct path. (The directory contains a .git file naming the path to the cloned repository, or in the old days or using the old style backwards-compatibility mode, contains the actual .git directory itself.) The superproject Git chdirs into this directory and tells the submodule Git:

  • run git checkout hash

Once that's happened, the path is full of files extracted from the commit whose ID is hash, which mostly makes the outer Git (the superproject) "done" with the files. But there is a side effect, because the submodule is itself a full Git repository, with everything that this means.

In particular, the submodule has its own HEAD. This HEAD is now detached, and the submodule repository's current commit is hash, so the contents of that commit are in the index and work-tree of the submodule. That is of course what we wanted: the work-tree of the submodule is the path in the superproject where all the submodule files go.
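
The detached HEAD can be observed directly. This is a sketch only, using hypothetical scratch repositories (lib, super, and a fresh clone of super named clone):

```shell
set -eu
work=$(mktemp -d)   # scratch directory; every name below is hypothetical
g() { git -c user.name=demo -c user.email=demo@example.com \
          -c protocol.file.allow=always "$@"; }

g init -q "$work/lib"
echo one > "$work/lib/file.txt"
g -C "$work/lib" add file.txt
g -C "$work/lib" commit -q -m "lib: first commit"

g init -q "$work/super"
g -C "$work/super" submodule --quiet add "$work/lib" lib
g -C "$work/super" commit -q -m "add lib as a submodule"

# Clone the superproject the way a collaborator would, then fill in lib.
g clone -q "$work/super" "$work/clone"
cd "$work/clone"
g submodule --quiet update --init

g -C lib rev-parse --abbrev-ref HEAD   # prints "HEAD" when HEAD is detached
g -C lib rev-parse HEAD                # the commit the submodule has checked out
g rev-parse HEAD:lib                   # the commit the superproject asked for
```

The last two commands should print the same hash, since git submodule update checks out exactly what the superproject recorded.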

But there's an interesting question to answer: Where did the superproject Git get the hash ID? The answer is: it's stored in every snapshot—well, every snapshot that uses the submodule—in the superproject, the same way every snapshot has a full, complete copy of every file. To make that happen, the index for the superproject contains a special entry of type gitlink.

This gitlink entry in the superproject index tells the superproject which hash ID to give to the submodule Git whenever the superproject tells the submodule Git: check out some particular commit.
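
The gitlink itself is easy to inspect. A minimal sketch, again with hypothetical scratch repositories named lib and super:

```shell
set -eu
work=$(mktemp -d)   # scratch directory; every name below is hypothetical
g() { git -c user.name=demo -c user.email=demo@example.com \
          -c protocol.file.allow=always "$@"; }

g init -q "$work/lib"
echo one > "$work/lib/file.txt"
g -C "$work/lib" add file.txt
g -C "$work/lib" commit -q -m "lib: first commit"

g init -q "$work/super"
cd "$work/super"
g submodule --quiet add "$work/lib" lib
g commit -q -m "add lib as a submodule"

# Mode 160000 marks a gitlink; the hash next to it is a commit ID in lib.
g ls-files --stage lib   # the entry in the superproject's index
g ls-tree HEAD lib       # the same entry as stored in the current commit
```

Unlike a regular file entry, the hash shown is not a blob in the superproject; it names a commit that exists in the submodule's repository.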

If you, manually, navigate into the submodule, and git checkout a branch name, or any other commit by hash ID, the submodule repository's HEAD changes. It either becomes attached to the branch name, or it points to the other commit, still in detached-HEAD mode.

At this point, the submodule and the superproject are out of sync. The superproject Git does not do anything about this yet. You are in control, you choose which commit you want. You can even make new commits and git push them to some upstream. Once you've done all of the committing and git checkout-ing that you want, and have everything arranged correctly, you should climb out of the submodule work-tree back into your superproject.

Now git status and git diff will, by default—there are a ton of control knobs here too—tell you that the superproject is calling for some hash H, but the submodule has some other hash S checked-out. (They may or may not also tell you if the submodule itself needs a commit made, if you set the control knobs for this.) If you wish your next superproject commit to record, in the gitlink for this submodule, this new commit hash S, you run:

git add path-to-submodule

(or git add -A should do the same thing, which is why this is puzzling). That will update the gitlink in your index to record hash ID S rather than H, so that the next superproject commit will, on a git submodule update command, tell the submodule Git: check out commit S, as your detached HEAD.

Once the index in the superproject matches the HEAD in the actual checked-out submodule, the submodule won't be listed in the changes not staged for commit section. If the hash in the gitlink in the index does not match the hash in the gitlink in HEAD, git status will list the submodule's path in changes to be committed.
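
The whole cycle can be sketched as follows, assuming hypothetical scratch repositories named lib and super; the status output at each step should move the submodule from one section to the next:

```shell
set -eu
work=$(mktemp -d)   # scratch directory; every name below is hypothetical
g() { git -c user.name=demo -c user.email=demo@example.com \
          -c protocol.file.allow=always "$@"; }

g init -q "$work/lib"
echo one > "$work/lib/file.txt"
g -C "$work/lib" add file.txt
g -C "$work/lib" commit -q -m "lib: first commit"

g init -q "$work/super"
cd "$work/super"
g submodule --quiet add "$work/lib" lib
g commit -q -m "add lib as a submodule"

# Make a new commit inside the submodule: it now has hash S, not H.
echo two >> lib/file.txt
g -C lib commit -q -am "lib: second commit"
g status      # lib is listed as changed but not staged

# Record S in the superproject's index, then in a new commit.
g add lib
g status      # lib is now listed under changes to be committed
g commit -q -m "bump lib to its second commit"
g status      # nothing left to report
```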

answered Oct 17 '22 11:10 by torek


and therefore not mention its unstaged status when I ask the parent for its status?

It would still report changes in the submodules, unless you are using (with Git 1.7.2 or later):

  • the --ignore-submodules[=<when>] option:

    git status --ignore-submodules=dirty
    
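  • the submodule.<name>.ignore configuration setting, which accepts the same <when> values per submodule (status.submoduleSummary, by contrast, only controls whether a summary of submodule commits is shown)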
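
A rough sketch of the command-line option in action, assuming hypothetical scratch repositories named lib and super and an uncommitted edit inside the submodule:

```shell
set -eu
work=$(mktemp -d)   # scratch directory; every name below is hypothetical
g() { git -c user.name=demo -c user.email=demo@example.com \
          -c protocol.file.allow=always "$@"; }

g init -q "$work/lib"
echo one > "$work/lib/file.txt"
g -C "$work/lib" add file.txt
g -C "$work/lib" commit -q -m "lib: first commit"

g init -q "$work/super"
cd "$work/super"
g submodule --quiet add "$work/lib" lib
g commit -q -m "add lib as a submodule"

# Leave an uncommitted edit inside the submodule.
echo two >> lib/file.txt

g status --porcelain                             # reports lib
g status --porcelain --ignore-submodules=dirty   # prints nothing for lib
```

With =dirty, a change to the commit checked out in the submodule would still be reported; only uncommitted work inside it is hidden.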

You can see the original discussion (back in 2010, for Git 1.7.x) here, which led to that feature:

By the way, I think that route of action would make the resulting git internally consistent in that everything by default will report submodules with untracked paths in its working tree as dirty.

  • In the "Untracked" section of "git status" output, we list an untracked path in the superproject (i.e. the one in which "git status" was run) to remind the user that the path might be a new file forgotten to be added (unless of course it is ignored).
    But it does not make the working tree dirty.

  • When you have an untracked path in a submodule:

    • the submodule is listed in the "Changed but not updated" section.
      This also makes the working tree of the superproject dirty, even though the working tree of the submodule is not.

    • "git diff" output at the superproject level shows that the submodule has modifications (i.e. "-dirty" is shown), but when run inside the submodule, there is no change shown.

I think this is a misdesign at the UI level; reporting an untracked and unignored path as potential mistake to remind the user is a good thing, but the current way "status" and "diff" does so does not make much sense to me.

answered Oct 17 '22 11:10 by VonC