Best practices for multiple git repositories

Tags: git, git-submodules, git-subtree, git-repo, git-slave

I have around 20 different repositories. Many are independent and compile as libraries but some others have dependencies among them. Dependency resolution and branching is complicated.

Suppose that I have a super project that only aggregates all other repositories. It is used exclusively to run tests -- no real development goes here.

/superproject  [master, HEAD]
    /a         [master, HEAD]
    /b         [master, HEAD]
    /c         [master, HEAD]
    /...

Now, to develop specific features or fixes for one of them (a), especially one that requires specific versions of other projects to compile or run (b v2.0 and c v3.0), I have to create a new branch:

/superproject  [branch-a, HEAD]  <-- branch for 'a' project
    /a         [master]  <-- new commits here
    /b         [v2.0]
    /c         [v3.0]

For b, something else might be required, such as a v0.9 and c v3.1:

/superproject  [branch-b, HEAD]  <-- branch for 'b' project
    /a         [v0.9]   <-- older version than on 'branch-a'
    /b         [master] <-- new commits go here
    /c         [v3.1]   <-- newer version than on 'branch-a'

This becomes even more complicated when implementing common git workflows involving feature branches, hotfix branches, release branches, etc. I was advised to use (and advised against using) git-submodules, git-subtree, Google's git-repo, git-slave, etc.

How can I manage continuous integration for such a complex project?

EDIT

The real question is: how can I run tests without having to mock all the other dependent projects, especially when the projects might all use different versions?

Trigger Jenkins tests after commits in git submodules

asked Jul 03 '15 17:07

betodelrio

2 Answers

For working with multiple branches in parallel, use parallel clones if possible. cd is an awful lot easier than checkout, clean, check-for-stale-detritus and recreate-caches every time you want to switch.
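
A minimal sketch of the parallel-clones idea, run entirely in a throwaway sandbox. The repository name `upstream`, the `work-*` directory names and the two branch names are illustrative assumptions, not anything from the question's real setup:

```shell
#!/bin/sh
# Sketch: one clone per branch, so switching context is just `cd`.
# All names below (upstream, work-*, branch-a, branch-b) are made up for the demo.
set -eu
sandbox=$(mktemp -d)
trap 'rm -rf "$sandbox"' EXIT
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# A stand-in repository with two long-lived branches.
git init -q "$sandbox/upstream"
(
  cd "$sandbox/upstream"
  git commit -q --allow-empty -m "initial commit"
  git branch branch-a
  git branch branch-b
)

# One working directory per branch instead of repeated checkout + clean.
for branch in branch-a branch-b; do
  git clone -q --branch "$branch" "$sandbox/upstream" "$sandbox/work-$branch"
done

# Each clone stays on its own branch, with its own build artifacts and caches.
head_a=$(git -C "$sandbox/work-branch-a" rev-parse --abbrev-ref HEAD)
head_b=$(git -C "$sandbox/work-branch-b" rev-parse --abbrev-ref HEAD)
echo "work-branch-a is on $head_a"
echo "work-branch-b is on $head_b"
```

In a real setup you would clone from your actual remote instead of the sandbox repository; the point is only that each branch gets its own directory.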

So far as recording your test environments goes, what you're describing is exactly what submodules do, in every detail. For something this simple, I'm going to recommend setting yourself up without using the git submodule command at all, and telling it about your setup only once you're comfortable and the top item on your submodule-issues list is keystroke count.

Starting from the setup in your question, here's how you set yourself up to record clean builds in the subprojects:

cd "$superproject"
git init .
git add a b c   # ...and each of the other subproject directories
git commit -m "recording test state for $thistest"

That's it. You've committed a list of commit ids, i.e. the ids of the currently checked-out commits in each of those repos. The actual content is in those repos, not this one, but that's the entire difference between files and submodules so far as git's concerned. The .gitmodules file holds notes to help cloners, mainly a suggested repo that's supposed to contain the necessary commits, plus defaults for the submodule commands; what it's doing is easy and obvious.
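
To see what actually gets recorded, here is a small sandbox sketch. The subproject names `a`, `b` and `c` mirror the question; everything else (the temp directory, the demo identity, the empty commits) is an assumption made so the script can run anywhere:

```shell
#!/bin/sh
# Sketch: show that `git add` on a nested repository records a commit id, not files.
set -eu
sandbox=$(mktemp -d)
trap 'rm -rf "$sandbox"' EXIT
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

superproject=$sandbox/superproject
mkdir "$superproject"
cd "$superproject"

# Three stand-in subprojects, each with one commit checked out.
for sub in a b c; do
  git init -q "$sub"
  git -C "$sub" commit -q --allow-empty -m "initial commit in $sub"
done

# Record the current state. git prints an "embedded repository" warning here;
# that is expected when no .gitmodules file exists yet.
git init -q .
git add a b c
git commit -q -m "recording test state"

# Every entry has mode 160000 (a gitlink) and the subproject's HEAD commit id.
git ls-files -s

recorded_a=$(git rev-parse :a)
head_a=$(git -C a rev-parse HEAD)
gitlinks=$(git ls-files -s | grep -c '^160000')
echo "index records $recorded_a for a; a has $head_a checked out"
```

The two ids printed on the last line should be identical, which is the whole mechanism: the superproject stores one commit id per subproject path.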

Want to check out the right commit at path foo?

(commit=$(git rev-parse :foo) && cd foo && git checkout "$commit")

The rev-parse reads the commit id recorded for foo in the index; the cd and checkout then check out that commit inside foo.

Here's how you find all your submodules and what should be checked out there to recreate the staged aka indexed environment:

git ls-files -s | grep '^160000'

Check what your current index lists for a submodule and what's actually checked out there:

echo $(git rev-parse ":$submodule"; (cd "$submodule" && git rev-parse HEAD))

and there you go. Check out the right commits in all your submodules?

git ls-files -s | grep '^160000' | while read -r mode commit stage path; do
        (cd "$path" && git checkout "$commit")
done
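
The comparison and the loop can be combined into a quick drift report across every recorded subproject. This is only a sketch: `report_drift` is a hypothetical helper name, and it assumes you run it from the superproject root:

```shell
# report_drift (hypothetical helper): run from the superproject root.
# For every gitlink in the index, say whether the subproject's HEAD
# matches the commit id the index has recorded for it.
report_drift() {
  git ls-files -s | grep '^160000' | while read -r mode recorded stage path; do
    actual=$(git -C "$path" rev-parse HEAD)
    if [ "$recorded" = "$actual" ]; then
      echo "ok $path"
    else
      echo "drifted $path (index $recorded, checked out $actual)"
    fi
  done
}
```

Calling `report_drift` before a test run is a cheap way to confirm that what you are about to test is what the superproject says it is.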

Sometimes you're carrying local patches you want applied to every checkout:

git ls-files -s | grep '^160000' | while read -r mode commit stage path; do
        (cd "$path" && git rebase "$commit")
done

and so forth. There are git submodule commands for these, but they're not doing anything you don't see above. The same goes for all the rest: you can translate everything they do into near-one-liners like the ones above.
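
One more variation in the same spirit, which the loops above don't cover directly: restoring the environment recorded in some other superproject commit, branch or tag rather than in the index. This is a sketch under assumptions; `checkout_recorded_at` is a made-up helper name, it expects to run from the superproject root, and it reads the gitlinks with `git ls-tree -r` instead of `git ls-files -s`:

```shell
# checkout_recorded_at REV (hypothetical helper): run from the superproject root.
# Reads the gitlinks recorded in superproject revision REV and checks out
# each recorded commit in the matching subproject (detached HEAD).
checkout_recorded_at() {
  # ls-tree prints: <mode> <type> <object id> TAB <path>
  git ls-tree -r "$1" | grep '^160000' | while read -r mode type commit path; do
    (cd "$path" && git checkout -q "$commit") ||
      echo "could not restore $path to $commit" >&2
  done
}
```

For example, `checkout_recorded_at branch-b` would put every subproject at the versions recorded on the question's branch-b. Note that it only moves the subprojects; the superproject's own index and HEAD stay where they are.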

There's nothing mysterious about submodules.

Continuous integration is generally done with any of a whole lot of tools; I'll leave that for someone else to address.

answered Oct 20 '22 12:10

jthill

As the author of git slave, I think it could work in this situation. How you would use it depends on whether you have control over repos a, b and c, by which I mean whether you can synchronize the branch strategy between them so that the v2 branch means the same thing for everyone. If so, I would strongly urge git slave, since you can essentially treat everything as one large project.

If you cannot mandate a common branch and tag strategy, then you would impose one yourself, which moves towards a lightweight version of the workflow that jthill suggested with git submodules. Specifically, you could have your own repo tracking a, b and c, and create a "branch a" branch in each one, which would correspond to whatever the correct branch for each slave repo is. As with git submodules, you would have to manually bring each repo up to date (by merging, in this case). However, you would not need the mother-may-I step of making the commit in the superproject. This technique is not the slam-dunk use case, where the slave projects share the same branch names in their own development, but it will work.

As jthill said, continuous integration is pretty much orthogonal to the question of how to wrangle the projects.

answered Oct 20 '22 12:10

Seth Robertson
