
Git repo where each submodule is a branch of same repo. How to avoid double/triple... download with git clone --recursive?

Tags: git

Suppose I have the following project tree:

src
data
doc

I'd like to keep all the folders in a Git repository, published to GitLab. But I don't want to track data and doc together with src.

So I use the following strategy:

git remote add origin ADDRESS
git submodule add -b data ADDRESS data
git submodule add -b doc ADDRESS doc

It actually works fine, except when I try to replicate the repository with:

git clone --recursive ADDRESS

all objects get transmitted three times, because the root, data, and doc each contain:

  • origin/master
  • origin/data
  • origin/doc

Is there an easy way to avoid this? Just to clarify what I'd like:

  • the master repository should only fetch origin/master, not the other two
  • the data submodule should only fetch origin/data
  • the doc submodule should only fetch origin/doc
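
For illustration, the per-branch fetching described in this list can be approximated by hand with git clone --single-branch. This is only a sketch: clone_branch_only is a hypothetical helper name, not a Git built-in, and it bypasses the submodule machinery entirely, so each directory becomes an independent clone.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): clone exactly one branch of a
# repository into a directory, without fetching the other branches.
#   $1 = repository URL, $2 = branch name, $3 = target directory
clone_branch_only() {
    git clone --single-branch --branch "$2" "$1" "$3"
}

# Intended use, with ADDRESS standing in for the real URL:
#   clone_branch_only ADDRESS master bar
#   clone_branch_only ADDRESS data   bar/data
#   clone_branch_only ADDRESS doc    bar/doc
```

Each clone then has only its own remote-tracking branch (for example origin/data), but the three clones share no object storage, so any history common to the branches is still transferred once per clone.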

This would be easy to achieve with three separate repositories, but that is too cumbersome, since I apply this approach to multiple projects.

UPDATE

git worktree from this answer allows me to achieve what I want manually.

But now, instead of the automatic approach (which consumes 4x bandwidth):

git clone --recursive git@foo:foo/bar.git

I have to do:

git clone git@foo:foo/bar.git
cd bar
git worktree add data origin/data
git worktree add src/notebooks origin/notebooks
git worktree add doc origin/doc
git worktree add reports origin/reports

I could automate this process with a script, since the .gitmodules file already contains all the information needed:

[submodule "data"]
    path = data
    url = git@foo:foo/bar.git
    branch = data
[submodule "src/notebooks"]
    path = src/notebooks
    url = git@foo:foo/bar.git
    branch = notebooks
[submodule "doc"]
    path = doc
    url = git@foo:foo/bar.git
    branch = doc
[submodule "reports"]
    path = reports
    url = git@foo:foo/bar.git
    branch = reports

Is there already a standard Git script or flag that handles this?
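
The automation mentioned above could look roughly like the sketch below. It is not a standard Git tool: worktrees_from_gitmodules is a hypothetical function name, and it assumes it is run from the top level of a plain (non-recursive) clone whose remote is called origin, with submodule names that contain no spaces. It reads each path and branch from .gitmodules with git config -f and issues the same git worktree add commands as the manual steps.

```shell
#!/bin/sh
# Hypothetical helper: create one worktree per entry in .gitmodules.
# Assumes the current directory is the top level of a plain clone and
# that the remote-tracking branches origin/<branch> already exist.
worktrees_from_gitmodules() {
    # Each output line has the form: submodule.<name>.path <path>
    git config -f .gitmodules --get-regexp '^submodule\..*\.path$' |
    while read -r key path; do
        # Strip the "submodule." prefix and ".path" suffix to get the name.
        name=${key#submodule.}
        name=${name%.path}
        # Skip entries that do not record a branch.
        branch=$(git config -f .gitmodules --get "submodule.$name.branch") || continue
        # Same command as the manual steps; stdin is redirected so git
        # cannot consume the loop's input.
        git worktree add "$path" "origin/$branch" </dev/null
    done
}
```

After git clone git@foo:foo/bar.git and cd bar, calling worktrees_from_gitmodules would replace the four manual git worktree add lines. Because origin/<branch> is a remote-tracking ref, each worktree starts on a detached HEAD, exactly as with the manual commands.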

abo-abo asked Apr 11 '17 20:04


1 Answer

Git is designed to be distributed, which means that by default every clone receives the whole history and all branches. If you want a single repository with several working trees, so that the objects are downloaded only once, you can use the git worktree command.

In your case, say src is the main folder with the src branch checked out. Creating the other two working trees from it should be as simple as:

git worktree add ../data data
git worktree add ../doc doc

See this answer https://stackoverflow.com/a/30185564/3066081 for more information about this command. If you have an older Git without worktree support (git worktree was added in Git 2.5), you can use the git-new-workdir script from Git's contrib directory instead:

git-new-workdir project-dir new-workdir branch

This is also described in "Multiple working directories with Git?"

Andrew answered Sep 19 '22 16:09