We have had a lot of problems recently with our Git repositories. We use git submodules for a total of 4 repositories shared between our applications.
For example, the repository 'website' has 3 submodules:
[submodule "vendor/api"]
path = vendor/api
url = [email protected]:api
[submodule "vendor/auth"]
path = vendor/auth
url = [email protected]:auth
[submodule "vendor/tools"]
path = vendor/tools
url = [email protected]:tools
We have correctly checked out our main repository 'website'. Now one of my co-workers has pushed, and I ran git pull; git status:
# On branch master
# Changed but not updated:
# (use "git add <file>..." to update what will be committed)
# (use "git checkout -- <file>..." to discard changes in working directory)
#
# modified: vendor/api (new commits)
# modified: vendor/auth (new commits)
# modified: vendor/tools (new commits)
#
no changes added to commit (use "git add" and/or "git commit -a")
mcfly@future:~/projects/website$ git diff
diff --git a/vendor/api b/vendor/api
index 41795fc..b582d80 160000
--- a/vendor/api
+++ b/vendor/api
@@ -1 +1 @@
-Subproject commit 41795fc4dde464d633f4c0f01eebb6ab1ad55582
+Subproject commit b582d802419b0ee7bc3959e7623fec0b94680269
diff --git a/vendor/auth b/vendor/auth
index a00369b..4599a71 160000
--- a/vendor/auth
+++ b/vendor/auth
@@ -1 +1 @@
-Subproject commit a00369bf29f14c761ce71f7b95aa1e9c107fb2ed
+Subproject commit 4599a7179c9b7ca4afa610a15ffa4a8fc6ebf911
diff --git a/vendor/tools b/vendor/tools
index f966744..c678cf6 160000
--- a/vendor/tools
+++ b/vendor/tools
@@ -1 +1 @@
-Subproject commit f966744359510656b492ae3091288664cdb1410b
+Subproject commit c678cf6f599fc450e312f0459ffe74e593f5890f
What's the problem with that git diff? The problem is that the new commits for each submodule are OLDER than the ones that will be overwritten. That's not what we want, because the repository correctly points to 41795fc4dde464d633f4c0f01eebb6ab1ad55582, a00369bf29f14c761ce71f7b95aa1e9c107fb2ed and f966744359510656b492ae3091288664cdb1410b, and if we add these modifications to our next commit we will probably break things. I don't know why it's getting the oldest revision and not the newest.
I have tried to solve this by myself but with no success:
mcfly@future:~/projects/website$ git pull; git submodule foreach git pull
Running that last command is not correct, because it will probably update the pointers in 'website' to the newest commit of each submodule, and we don't want that. We want to preserve the correct revision that is recorded in the repository.
One thing I should explain is that we usually work inside these submodules, for example:
mcfly@future:~/projects/website$ cd vendor/api
mcfly@future:~/projects/website/vendor/api$ git checkout master
mcfly@future:~/projects/website/vendor/api$ echo "lorem ipsum" >> example.file
mcfly@future:~/projects/website/vendor/api$ git add example.file; git push
When we do a git submodule update, the 'master' branch is lost on every submodule.
Finally, what is the correct way of doing push and pull and working with submodules without having all these problems?
Thank you in advance
Take a look at the git-scm documentation and pass it around to your team. The phenomenon you're seeing is described exactly in the "Cloning a Project with Submodules" section.
First, the initial state you observed, where git diff shows unexpectedly opposite results for those commit hashes, indicates that you merged a submodule update in the parent repo but didn't run git submodule update locally. You have to run git submodule update every time you pull down a submodule change in the main project. Why? The submodule's pointer, i.e. what the parent repository thinks is the state of vendor/auth, isn't actually the HEAD commit of the submodule repository vendor/auth. The pull moved the recorded pointer forward, but the checkout inside vendor/auth stays at the old commit until you update it, so the diff reports your stale checkout as the "new" side. It's a little confusing until you understand how git is tracking the submodule states. Again, the git-scm documentation is worth a read.
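To make that concrete, here is a rough sandbox reproduction rather than anything taken from your setup. The repository names (api.git, website.git, mine, coworker) and the temporary directory are assumptions invented for the demo, and the environment variables only exist so it runs the same way on any machine. It shows the parent reporting vendor/api as modified right after a pull, and the report disappearing after git submodule update.

```shell
#!/bin/sh
# Sandbox reproduction; every repository name and path here is made up for the demo.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Throwaway identity, plus settings that keep the demo deterministic:
# allow local-path submodule clones and keep `git pull` from updating submodules itself.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_CONFIG_COUNT=2
export GIT_CONFIG_KEY_0=protocol.file.allow GIT_CONFIG_VALUE_0=always
export GIT_CONFIG_KEY_1=submodule.recurse GIT_CONFIG_VALUE_1=false

# 1. A shared library repository with one commit.
git init -q --bare api.git
git clone -q api.git api-work 2>/dev/null
(cd api-work && echo one > example.file && git add example.file &&
  git commit -q -m "api: first commit" && git push -q -u origin HEAD)

# 2. The parent repository, recording the library as a submodule at vendor/api.
git init -q --bare website.git
git clone -q website.git mine 2>/dev/null
(cd mine && git submodule --quiet add "$sandbox/api.git" vendor/api &&
  git commit -q -m "add vendor/api" && git push -q -u origin HEAD)

# 3. The library gains a commit; a co-worker records it in the parent and pushes.
(cd api-work && echo two >> example.file &&
  git commit -q -am "api: second commit" && git push -q)
new=$(git -C api-work rev-parse HEAD)
git clone -q --recurse-submodules website.git coworker
(cd coworker && git -C vendor/api fetch -q origin &&
  git -C vendor/api checkout -q "$new" &&
  git add vendor/api && git commit -q -m "bump vendor/api" && git push -q)

# 4. Back in my clone: pull, inspect, update, inspect again.
cd mine
git pull -q
before=$(git status --porcelain)   # submodule checkout lags behind the pulled pointer
git submodule --quiet update --init --recursive
after=$(git status --porcelain)    # checkout now matches the pointer
echo "before update: [$before]"
echo "after update:  [$after]"
```

The habit to take away is simply running git submodule update (optionally with --init --recursive) straight after every git pull in the parent, before committing anything there.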
Second, git submodule update abandons the master branch on the submodule by design. Check out the "Issues with submodules" section of those docs. The man page, as is often true with git, tells us what we need to know:
update
Update the registered submodules, i.e. clone missing submodules and checkout the commit specified in the index of the containing repository. This will
make the submodules HEAD be detached unless --rebase or --merge is specified or the key submodule.$name.update is set to rebase, merge or none. none
can be overridden by specifying --checkout.
You're putting your submodules in 'detached HEAD' state every time you issue git submodule update without arguments.
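As a hedged sketch of the alternative the man page hints at, the sandbox below sets submodule.vendor/api.update to rebase and checks out the branch inside the submodule before a pointer bump arrives. All repository names (api.git, website.git, seed, mine) are invented for the demo, and the branch name is read from the sandbox instead of assuming master. Whether rebase, merge, or a plain checkout suits your team is a judgment call; this only shows that the branch survives the update.

```shell
#!/bin/sh
# Sandbox demo; repository names and paths are hypothetical.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_CONFIG_COUNT=2
export GIT_CONFIG_KEY_0=protocol.file.allow GIT_CONFIG_VALUE_0=always
export GIT_CONFIG_KEY_1=submodule.recurse GIT_CONFIG_VALUE_1=false

# Library repository with one commit, and a parent recording it at vendor/api.
git init -q --bare api.git
git clone -q api.git api-work 2>/dev/null
(cd api-work && echo one > example.file && git add example.file &&
  git commit -q -m "api: first commit" && git push -q -u origin HEAD)
branch=$(git -C api-work symbolic-ref --short HEAD)   # master or main, per local defaults
git init -q --bare website.git
git clone -q website.git seed 2>/dev/null
(cd seed && git submodule --quiet add "$sandbox/api.git" vendor/api &&
  git commit -q -m "add vendor/api" && git push -q -u origin HEAD)

# A fresh clone runs a plain update, so the submodule starts out detached.
git clone -q --recurse-submodules website.git mine
cd mine
git -C vendor/api symbolic-ref -q HEAD >/dev/null && first=branch || first=detached

# Opt in to rebase-style updates and get onto the branch before working.
git config submodule.vendor/api.update rebase
git -C vendor/api checkout -q "$branch"

# Someone bumps the pointer upstream (simulated here through the seed clone).
(cd ../api-work && echo two >> example.file &&
  git commit -q -am "api: second commit" && git push -q)
(cd ../seed && git -C vendor/api pull -q && git add vendor/api &&
  git commit -q -m "bump vendor/api" && git push -q)

# Pull the bump; the update now rebases the checked-out branch instead of detaching.
git pull -q
git submodule --quiet update
second=$(git -C vendor/api symbolic-ref --short HEAD)
echo "after clone:  $first"
echo "after update: on $second"
```

Passing --rebase or --merge on the command line has the same effect for a single run. Note that the branch has to be checked out in the submodule first; if HEAD is already detached, the update has no branch to move.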
So how do you work with submodules without having these problems? First, ask yourself and your team: Do we really need them? Submodules are a powerful and useful feature in some cases, but they were designed more for 3rd party libraries than active projects fractured into sub-repositories. You can certainly use them this way, but the management overhead might rapidly exceed whatever benefits you're getting. Unless your repository is quite large, or your submodules are completely modular, it's worth asking "Would we be better off with a single repository?" Even if the answer is "no", check out subtree merging, which might be more successful for your use case.
If you'd still like to use submodules, check out the docs linked above, as well as the many questions and answers on SO and other sites about submodule workflows. They should help you achieve a saner process.