
Multiple repos with single submodule

I've looked for a while and didn't find an answer (maybe I don't know what to look for).

We have a main library that is a repository by itself (let's call it Lib); it contains most of our modules and submodules. Let's also say it is 2GB in size.

Now we have many projects, such as ProjA, ProjB and ProjC, and each one uses Lib as a submodule.

ProjA

  • Lib (branch:master,commit#:1)

ProjB

  • Lib (branch:other,commit#:2)

ProjC

  • Lib (branch:master,commit#:4)

So while I'm able to keep every project referencing the correct library (i.e. submodule) version, I now have 3*2GB = 6GB of THE SAME submodule.

Is there a way to reference a single copy of the submodule while keeping the correct files/versions referenced in each project?

E.g.

ProjA

  • Lib/base_lib.h (v1.0)

  • Lib/file_only_in_this_commit

ProjB

  • Lib/base_lib.h (v1.0)

ProjC

  • Lib/base_lib.h (v1.1)

Thanks!

Rock_Artist asked Dec 08 '15 09:12



2 Answers

Update:

I've since switched to git submodule's --reference flag and wrote a new script, init_submodules, that uses it to solve the problem.
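A minimal sketch of how the --reference flag can be used, run in a throwaway sandbox. The names lib-src, ProjA-src, Lib.git and the helper g are hypothetical, and this is not the init_submodules script itself. It keeps one shared bare clone of Lib and lets a project's submodule borrow objects from it:

```shell
#!/bin/sh
# Sandbox demo; every path and name below is made up for illustration.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Helper: git with a throwaway identity; protocol.file.allow lets newer
# git versions clone a submodule from a local path.
g() {
    git -c user.name=demo -c user.email=demo@example.com \
        -c protocol.file.allow=always "$@"
}

# Stand-in for the real Lib remote.
g init -q lib-src
echo 'v1.0' > lib-src/base_lib.h
g -C lib-src add base_lib.h
g -C lib-src commit -q -m 'Lib v1.0'

# Stand-in for the ProjA remote, which records Lib as a submodule.
g init -q ProjA-src
g -C ProjA-src submodule --quiet add "$sandbox/lib-src" Lib
g -C ProjA-src commit -q -m 'Add Lib submodule'

# One shared bare clone of Lib, reused by every project checkout.
g clone -q --bare lib-src Lib.git

# Fresh checkout of ProjA whose submodule borrows objects from Lib.git.
g clone -q ProjA-src ProjA
g -C ProjA submodule --quiet update --init --reference "$sandbox/Lib.git"

# The submodule's repository now lists Lib.git as an alternate object store.
cat ProjA/.git/modules/Lib/objects/info/alternates
```

With a real 2GB Lib, the same update command would be repeated in ProjB and ProjC, each pointing at the one shared Lib.git.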

My original/deprecated answer:

You can use git worktree (available since git 2.5) to create additional worktrees for the Lib submodule, at the locations inside ProjA, ProjB, etc.

Because git worktree makes it awkward to create several worktrees with the same name (all of them are called "Lib"), I wrote a script, share_submodules, to work around the difficulties. It creates an additional worktree instead of a submodule, sets it to the right submodule commit, and does this recursively for all the submodules inside the shared module.

It should work just as if the submodule had been created by git submodule update --init --recursive, except that all copies refer to one module's objects.
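The underlying idea can be sketched with plain git worktree commands. This is only an illustration in a sandbox, not the share_submodules script: ProjA and ProjC are ordinary directories here rather than real superprojects, and the helper g and all paths are hypothetical.

```shell
#!/bin/sh
# Sandbox demo; paths and names are hypothetical.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Helper: git with a throwaway identity for the demo commits.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# A single Lib repository with two versions of base_lib.h.
g init -q Lib
echo 'v1.0' > Lib/base_lib.h
g -C Lib add base_lib.h
g -C Lib commit -q -m 'Lib v1.0'
v1_0=$(g -C Lib rev-parse HEAD)
echo 'v1.1' > Lib/base_lib.h
g -C Lib commit -q -a -m 'Lib v1.1'
v1_1=$(g -C Lib rev-parse HEAD)

# Each project gets its own checkout of Lib at its own commit, but all
# worktrees share the object store of the one Lib repository.
mkdir ProjA ProjC
g -C Lib worktree add --detach "$sandbox/ProjA/Lib" "$v1_0"
g -C Lib worktree add --detach "$sandbox/ProjC/Lib" "$v1_1"

# Show the main worktree and the two additional ones.
g -C Lib worktree list
```

In each additional worktree, .git is a small file pointing back at the main Lib repository, so the history is stored only once.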

If you're transitioning to it by removing the submodule, stray submodule files remain in your .git; I created find_stray_submodules.py to clean them up.

yairchu answered Nov 15 '22 13:11


Well, internally the whole submodule mechanism is quite simple, so you can tailor it to your taste.

Inside each of your Proj<N>/.git/modules/ there's a folder corresponding to the Lib submodule. It holds a bare repository cloned from the remote specified as submodule.Lib.url in Proj<N>/.gitmodules. Those bare repositories are the place to optimize.

You may simply recreate them using hardlinks where possible.

1) Create a bare clone of your Lib in a folder on the same filesystem as all your Proj repos:

git clone --bare url://to/Lib /path/to/Lib.git

2) Replace the default submodule repo with a repo that references the bare repo from step 1:

mv ProjA/.git/modules/Lib ProjA/.git/modules/Lib.old  # preserve it for a while
git clone --bare --local url://to/Lib \
    --reference /path/to/Lib.git ProjA/.git/modules/Lib

3) Restore the config from the preserved repo into the new ProjA/.git/modules/Lib:

cp ProjA/.git/modules/Lib.old/config ProjA/.git/modules/Lib/config

Now check that everything works in ProjA, remove ProjA/.git/modules/Lib.old, and repeat the steps for the other projects. All the repos will then share the same object files.
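One way to check that a repo really borrows from the shared clone is to look at its alternates file. The sketch below reproduces steps 1 and 2 in miniature inside a sandbox; lib-src, borrower.git and the helper g are hypothetical stand-ins, not paths from a real setup.

```shell
#!/bin/sh
# Sandbox demo; paths and names are hypothetical.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Helper: git with a throwaway identity for the demo commit.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# Stand-in for the Lib remote.
g init -q lib-src
echo 'v1.0' > lib-src/base_lib.h
g -C lib-src add base_lib.h
g -C lib-src commit -q -m 'Lib v1.0'

# Step 1 in miniature: the shared bare clone.
g clone -q --bare lib-src Lib.git

# Step 2 in miniature: a second bare clone that references Lib.git.
g clone -q --bare --reference "$sandbox/Lib.git" lib-src borrower.git

# The alternates file names the object store that borrower.git relies on.
cat borrower.git/objects/info/alternates

# The borrowed history is fully usable: HEAD resolves to Lib's commit.
g -C borrower.git rev-parse HEAD
```

Because borrower.git depends on Lib.git for its objects, the shared clone has to stay in place for as long as any project references it.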

In git, a particular state of a submodule is referenced by a precise SHA1. Unless you perform some "evil" operations in your main Lib repo (e.g. git filter-branch or other operations that may lead to deletion of a commit), all proper commits in Lib are kept forever. Each Proj<N> checks out its particular commit completely independently of the others, so you needn't worry that a state of Lib in ProjA may interfere with another state of Lib in ProjB.
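The pinned SHA1 can be inspected directly in the superproject's tree. A small sandbox sketch, with hypothetical names (lib-src, the helper g) standing in for the real repositories:

```shell
#!/bin/sh
# Sandbox demo; paths and names are hypothetical.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Helper: git with a throwaway identity; protocol.file.allow lets newer
# git versions clone a submodule from a local path.
g() {
    git -c user.name=demo -c user.email=demo@example.com \
        -c protocol.file.allow=always "$@"
}

# Stand-in for the Lib remote.
g init -q lib-src
echo 'v1.0' > lib-src/base_lib.h
g -C lib-src add base_lib.h
g -C lib-src commit -q -m 'Lib v1.0'
lib_head=$(g -C lib-src rev-parse HEAD)

# A project that pins Lib at its current commit.
g init -q ProjA
g -C ProjA submodule --quiet add "$sandbox/lib-src" Lib
g -C ProjA commit -q -m 'Pin Lib'

# The project's tree stores Lib as a gitlink entry (mode 160000) that
# records only the commit hash, not the submodule's files.
pinned=$(g -C ProjA ls-tree HEAD Lib)
echo "$pinned"

# submodule status reports the commit currently checked out in Lib.
g -C ProjA submodule status
```

Each project carries its own gitlink entry, which is why ProjA and ProjB can point at different Lib commits while sharing one object store.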

user3159253 answered Nov 15 '22 12:11