
My project uses over 100 git submodules. Which submodule alternative can handle a lot of repositories gracefully?

I've been researching git subtree and other alternatives to git submodules. My project has well over 100 submodules and it's very unwieldy to manage them all.

Can anyone recommend a workflow that works really well with a large number of repositories that need to be kept in sync?

Mukunda Modell asked May 06 '15 18:05




2 Answers

If your project has over 100 git submodules of components and dependencies, managing them will be unwieldy no matter which approach you use :-) I suggest looking for ways to script and automate as many parts as possible. Trust me, the novelty of playing with and chaining git commands wears off very quickly for most people, especially when deadlines are approaching. There is already a very good answer here comparing the different approaches to managing git sub-projects.

Regarding workflow, I would first separate the repositories that are under your control from those that aren't, i.e. 3rd party repositories.

For 3rd party repositories that don't change often (either via merges or upstream PRs), you can still use submodules. Typically, you will pin these submodules to stable tags. Syncing them is then just a matter of running (or scripting) git submodule update --recursive --remote. Note that --remote moves each submodule to the tip of its tracked remote branch; drop it if you want to check out exactly the commit recorded in the parent repository. If these 3rd party dependencies can be specified in a package management tool like Bundler (for Ruby projects), that will further simplify managing your subprojects.
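As a rough illustration of scripting that sync step, here is a minimal Python sketch. The function names (build_update_command, sync_submodules) are made up for this example; only the git flags come from the command above.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def build_update_command(recursive: bool = True, remote: bool = True) -> list[str]:
    """Return the argv for `git submodule update` with the chosen flags."""
    cmd = ["git", "submodule", "update"]
    if recursive:
        cmd.append("--recursive")  # also update nested submodules
    if remote:
        cmd.append("--remote")  # move to the tip of each submodule's tracked branch
    return cmd


def sync_submodules(superproject: Path, recursive: bool = True, remote: bool = True) -> None:
    """Run the update inside `superproject` (hypothetical helper).

    Raises subprocess.CalledProcessError if git exits with a non-zero status.
    """
    subprocess.run(build_update_command(recursive, remote), cwd=superproject, check=True)
```

Calling sync_submodules(Path("path/to/superproject")) from a cron job or a CI step would then replace the manual command; the path is a placeholder.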

For repositories that you own and that change often, gitslave and git-subtree are two alternatives, depending on your team's preferences.

gitslave multiplexes git operations across multiple repositories. IOW, when you branch, merge, commit, push, pull etc., each command is run on the parent project and on all slaves in turn. This requires the team to work in a top-down manner, starting from the super-project down to the slaves.

git-subtree uses Git's subtree merge functionality to achieve an effect similar to submodules, by actually storing the files in the main repository and merging changes directly into that repository. The end result is a single canonical repository with the option of including all the subprojects' history. In a way, this lets team members focus more on the subtrees they are responsible for, but it requires extra work to merge changes back to the parent tree.
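To make the git-subtree option a bit more concrete, here is a small Python sketch that builds the usual add / pull / push invocations. The helper name subtree_command, the prefix and the repository URL are placeholders for this example, not part of any existing tool.

```python
from __future__ import annotations


def subtree_command(action: str, prefix: str, repository: str, ref: str,
                    squash: bool = True) -> list[str]:
    """Build a `git subtree` argv for one sub-project (hypothetical helper).

    action: "add" to import the sub-project, "pull" to merge upstream changes,
            "push" to send changes made under `prefix` back to the sub-project.
    """
    if action not in {"add", "pull", "push"}:
        raise ValueError(f"unsupported subtree action: {action}")
    cmd = ["git", "subtree", action, f"--prefix={prefix}"]
    # --squash collapses the imported history into one commit; it does not apply to push.
    if squash and action != "push":
        cmd.append("--squash")
    cmd += [repository, ref]
    return cmd
```

Passing squash=False keeps the full history of the sub-project in the parent repository, which is the "option of including all the subprojects' history" mentioned above.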

As a developer, my preference is to work at the lower sub-projects level (to do my "red, green, refactor" cycle), and touch the parent projects only when necessary. But regardless of whether you choose a top-down or bottom-up workflow, try to identify repetitive error-prone steps in your branching & merging strategy, and script them as much as possible.
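One way to script the repetitive part is a small loop that runs the same git command in every repository and reports which ones failed. This is only a sketch under my own assumptions: run_in_each and default_runner are hypothetical names, and the runner is injectable so the loop can be exercised without touching real repositories.

```python
from __future__ import annotations

import subprocess
from pathlib import Path
from typing import Callable, Iterable, Sequence


def default_runner(git_args: Sequence[str], cwd: Path) -> int:
    """Run `git <git_args>` inside `cwd` and return its exit status."""
    return subprocess.run(["git", *git_args], cwd=cwd).returncode


def run_in_each(repos: Iterable[Path], git_args: Sequence[str],
                runner: Callable[[Sequence[str], Path], int] = default_runner) -> list[Path]:
    """Run the same git command in every repository, in order.

    Keeps going after a failure and returns the repositories whose command
    exited with a non-zero status, so they can be fixed by hand afterwards.
    """
    failed: list[Path] = []
    for repo in repos:
        if runner(git_args, repo) != 0:
            failed.append(repo)
    return failed
```

For example, run_in_each(repo_paths, ["fetch", "--all"]) would fetch everywhere and hand back the list of repositories that need attention; repo_paths is whatever list of checkouts your project uses.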

ivan.sim answered Oct 11 '22 12:10


I've had the same issue, though with about 15-20 submodules rather than 100. I built a CLI to assist with commit, push, pull, rebase, checkout, etc. I also use hard links within my applications, so the CLI handles that too, but hard linking is not necessary. The CLI is written in Go and has releases for a range of OS platforms.

For my applications, my workflow usually has a ".boiler" folder where all my submodules go. I hard link files from .boiler into the "src" folder of my application, so when I edit the linked file, the change also lands in the source file, which lives in the git submodule.
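For anyone who wants to try the hard-link idea without the CLI, this is roughly what it boils down to, sketched in Python. It is not how the CLI itself is implemented; link_into_src is a made-up name, and the file paths are placeholders.

```python
from __future__ import annotations

import os
from pathlib import Path


def link_into_src(boiler_file: Path, src_file: Path) -> None:
    """Hard link a file from the submodule folder into the application source.

    Both paths then refer to the same file on disk, so an in-place edit
    through either path is visible through the other. Hard links only work
    within a single filesystem.
    """
    src_file.parent.mkdir(parents=True, exist_ok=True)
    # Already linked to the same file: nothing to do.
    if src_file.exists() and os.path.samefile(boiler_file, src_file):
        return
    # Raises FileExistsError if an unrelated file already occupies src_file.
    os.link(boiler_file, src_file)
```

One caveat: some editors save by writing a new file and renaming it over the old one, which silently breaks the link, so it is worth re-running the linking step now and then.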

Here's the link to the CLI with install instructions. Of course, you can also just download the release and put it in any directory that's on your PATH.

https://github.com/ml27299/lit-cli

Mac Lara answered Oct 11 '22 11:10