
Git submodules workflow issues

We have been having a lot of problems with our Git repositories recently. We use Git submodules for a total of 4 repositories shared between our applications.

For example, the repository 'website' has 3 submodules:

[submodule "vendor/api"]
    path = vendor/api
    url = [email protected]:api
[submodule "vendor/auth"]
    path = vendor/auth
    url = [email protected]:auth
[submodule "vendor/tools"]
    path = vendor/tools
    url = [email protected]:tools

We have checked out our main repository 'website' correctly. Now one of my co-workers has pushed, and I run git pull; git status:

# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#   modified:   vendor/api (new commits)
#   modified:   vendor/auth (new commits)
#   modified:   vendor/tools (new commits)
#
no changes added to commit (use "git add" and/or "git commit -a")

mcfly@future:~/projects/website$ git diff

diff --git a/vendor/api b/vendor/api
index 41795fc..b582d80 160000
--- a/vendor/api
+++ b/vendor/api
@@ -1 +1 @@
-Subproject commit 41795fc4dde464d633f4c0f01eebb6ab1ad55582
+Subproject commit b582d802419b0ee7bc3959e7623fec0b94680269
diff --git a/vendor/auth b/vendor/auth
index a00369b..4599a71 160000
--- a/vendor/auth
+++ b/vendor/auth
@@ -1 +1 @@
-Subproject commit a00369bf29f14c761ce71f7b95aa1e9c107fb2ed
+Subproject commit 4599a7179c9b7ca4afa610a15ffa4a8fc6ebf911
diff --git a/vendor/tools b/vendor/tools
index f966744..c678cf6 160000
--- a/vendor/tools
+++ b/vendor/tools
@@ -1 +1 @@
-Subproject commit f966744359510656b492ae3091288664cdb1410b
+Subproject commit c678cf6f599fc450e312f0459ffe74e593f5890f

What's the problem with that git diff? The problem is that the "new" commits shown for each submodule are OLDER than the ones they would overwrite. That's not what we want, because the repository correctly points to 41795fc4dde464d633f4c0f01eebb6ab1ad55582, a00369bf29f14c761ce71f7b95aa1e9c107fb2ed and f966744359510656b492ae3091288664cdb1410b, and if we add these modifications to our next commit we will probably break things. I don't know why it's picking up the older revision and not the newest.

I have tried to solve this by myself but with no success:

mcfly@future:~/projects/website$ git pull; git submodule foreach git pull

Running that last command is not correct, because it will probably move the pointers in 'website' to the newest commit of each submodule, and we don't want that. We want to preserve the correct revisions recorded in the repository.

One thing I should explain is that we usually work inside these submodules, for example:

mcfly@future:~/projects/website$ cd vendor/api
mcfly@future:~/projects/website/vendor/api$ git checkout master
mcfly@future:~/projects/website/vendor/api$ echo "lorem ipsum" >> example.file
mcfly@future:~/projects/website/vendor/api$ git add example.file; git commit -m "Update example.file"; git push

When we run git submodule update, the 'master' branch is lost in every submodule.

Finally, what is the correct way to push, pull, and work with submodules without having all these problems?

Thank you in advance

asked Aug 22 '12 14:08 by blacksoul


1 Answer

Take a look at the git-scm documentation and pass it around to your team. The phenomenon you're seeing is described exactly in the "Cloning a Project with Submodules" section.

First, the initial state you observed, where git diff shows those commit hashes in the opposite order from what you expect, indicates that you merged a submodule update in the parent repo but didn't run git submodule update locally. You have to run git submodule update every time you pull down a submodule change in the main project. Why? The submodule's pointer, i.e. the commit the parent repository records for vendor/auth, isn't necessarily the commit currently checked out (HEAD) in your local vendor/auth. After the pull, the parent records the newer commit while your submodule checkout still sits on the older one, so git diff reports the older commit as the "change". It's a little confusing until you understand how git tracks submodule states. Again, the git-scm documentation is worth a read.
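To see this behaviour in isolation, here is a minimal sketch that rebuilds the situation in throwaway repositories. Everything in it is an assumption for illustration: the names lib, site-a, site-b and vendor/lib are made up, and the g wrapper only exists to pin a commit identity and to allow local-path submodule clones, which recent Git versions refuse by default.

```shell
#!/bin/sh
# Hypothetical sandbox: every repository and path name below is made up.
set -e
work=$(mktemp -d)
cd "$work"

# Wrapper: fixed identity, local-path submodule clones allowed,
# and no automatic submodule recursion on pull.
g() {
    git -c user.name=demo -c user.email=demo@example.invalid \
        -c protocol.file.allow=always -c submodule.recurse=false "$@"
}

# 1. A shared library and a parent repository that embeds it as a submodule.
git init -q lib
(cd lib && echo one > file && g add file && g commit -q -m "lib: first commit")
git init -q site-a
(cd site-a && g submodule --quiet add "$work/lib" vendor/lib \
    && g commit -q -m "add vendor/lib")

# 2. Our own clone, with the submodule checked out at the recorded commit.
g clone -q --recurse-submodules site-a site-b

# 3. A co-worker adds a library commit and records it in the parent repository.
(cd lib && echo two >> file && g commit -q -am "lib: second commit")
(cd site-a/vendor/lib && g pull -q)
(cd site-a && g add vendor/lib && g commit -q -m "bump vendor/lib")

# 4. We pull. The recorded pointer moves forward, but the submodule's
#    working tree stays on the old commit, so it shows up as modified.
cd site-b
g pull -q
before=$(g status --short)
echo "after pull:   '$before'"

# 5. Check out the commit that the parent repository records.
g submodule --quiet update --init --recursive
after=$(g status --short)
recorded=$(g rev-parse HEAD:vendor/lib)
checked_out=$(g -C vendor/lib rev-parse HEAD)
echo "after update: '$after'"
```

After step 4 the parent should list vendor/lib as modified even though nobody touched it locally; after step 5 the status should be empty and the submodule's HEAD should match the commit recorded in the parent.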

Second, git submodule update abandons the master branch on the submodule by design. Check out the "Issues with submodules" section of those docs. The man page, as is often true with git, tells us what we need to know:

update
   Update the registered submodules, i.e. clone missing submodules and checkout the commit specified in the index of the containing repository. This will
   make the submodules HEAD be detached unless --rebase or --merge is specified or the key submodule.$name.update is set to rebase, merge or none.  none
   can be overridden by specifying --checkout.

You're putting your submodules in 'detached HEAD' state every time you issue git submodule update without arguments.
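If you want to stay on a branch inside the submodule, the --merge option from the man page excerpt is one way to do it. The following is a sketch under the same assumptions as any sandbox demo: lib, site-a, site-b and vendor/lib are hypothetical names, and the g wrapper is only there to pin an identity and to allow local-path submodule clones on recent Git versions.

```shell
#!/bin/sh
# Hypothetical sandbox: every repository and path name below is made up.
set -e
work=$(mktemp -d)
cd "$work"

g() {
    git -c user.name=demo -c user.email=demo@example.invalid \
        -c protocol.file.allow=always -c submodule.recurse=false "$@"
}

# A library, a parent repository using it as a submodule, and our clone.
git init -q lib
(cd lib && echo one > file && g add file && g commit -q -m "lib: first commit")
branch=$(git -C lib rev-parse --abbrev-ref HEAD)   # "master" or "main"
git init -q site-a
(cd site-a && g submodule --quiet add "$work/lib" vendor/lib \
    && g commit -q -m "add vendor/lib")
g clone -q --recurse-submodules site-a site-b

# A freshly updated submodule has a detached HEAD; rev-parse reports "HEAD".
detached=$(git -C site-b/vendor/lib rev-parse --abbrev-ref HEAD)
echo "after clone: $detached"

# We switch to the branch inside the submodule to work on it.
g -C site-b/vendor/lib checkout -q "$branch"

# Meanwhile a co-worker moves the recorded submodule commit forward.
(cd lib && echo two >> file && g commit -q -am "lib: second commit")
(cd site-a/vendor/lib && g pull -q)
(cd site-a && g add vendor/lib && g commit -q -m "bump vendor/lib")

# Pull the parent, then merge the recorded commit into the submodule's
# current branch instead of detaching HEAD.
cd site-b
g pull -q
g submodule --quiet update --merge
kept=$(git -C vendor/lib rev-parse --abbrev-ref HEAD)
clean=$(g status --short)
echo "after update --merge: $kept"
```

According to the same man page text, setting submodule.$name.update to merge (or rebase) in the parent's configuration should have the same effect for a plain git submodule update. Note that if the submodule branch holds local commits, the merge can produce a commit that differs from the recorded one, and the parent will then report the submodule as modified until you commit the new pointer.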

So how do you work with submodules without having these problems? First, ask yourself and your team: Do we really need them? Submodules are a powerful and useful feature in some cases, but they were designed more for 3rd party libraries than active projects fractured into sub-repositories. You can certainly use them this way, but the management overhead might rapidly exceed whatever benefits you're getting. Unless your repository is quite large, or your submodules are completely modular, it's worth asking "Would we be better off with a single repository?" Even if the answer is "no", check out subtree merging, which might be more successful for your use case.

If you'd still like to use submodules, check out the docs linked above, as well as the many questions and answers on SO and other sites about submodule workflows. They should help you achieve a saner process.

answered Nov 07 '22 22:11 by Christopher