"Layering" git repository

Question

I've been using git on a daily basis for a while now, and this time I've run into a problem that I could describe like this.

I have a repository which holds an entire website structure, and the web root is in the root of the repository. Everything was fine while it was the repository for a single site. However, that same repo is now used for several sites: basically the same site, in different languages, with minor template tweaks, different graphics, etc. Those things are naturally versioned.

There is a master branch, which holds the original source code of the site, and I'd like master (or some other branch) to hold the code that is universal across all sites, as there will eventually be changes that are too site-specific to include in the universal part of the repo.

Next, there is a branch for every single site which uses this source code. All those branches (say, site1, site2, and site3) are created from the master branch, and each site clones the correct branch.

Well, it seemed like a good idea, until I started making changes everywhere.

If I made a change on the site1 branch and needed to copy that change to the site2 branch, I would cherry-pick the commit from one branch to the other. Merging is out of the question there, as there are other changes on the site1 branch which do not belong on the site2 branch. Is there some other, more elegant solution for this kind of situation, or is cherry-picking exactly for this purpose?

Now, the real "problem" for me is when I change master, and then I want to copy all those changes to all branches. Naturally, considering the fact that all branches are descendants of master, and that I do want those changes in all site* branches, I switch to each branch and merge master.

This creates a pretty nasty-looking history for all branches. Each round of merges complicates the graph considerably, which leads me to two conclusions:

This way of layering branches can work as long as I watch my step, don't do anything stupid, and don't try to make any sense of the all-branches history graph. Or...
There has to be some better, more appropriate way to do it.

To illustrate my "problem", here is an image of the graph that I got after creating those branches, adding a few branch-specific commits, cherry-picking a few of them, adding one commit to master and merging it into all branches, adding a commit or two to specific branches, and then doing one more master-to-all merge.

[Image: sort of not-so-simple history graph]

I don't know, I like simplicity, and maybe I'm not used to seeing hard-to-follow graphs like this one (which will only grow in complexity with every following merge, I'm afraid).

I guess I could do cherry-picking all the way and have a neat history graph, but that doesn't sound right either, since I might do several commits in a row and then forget to pick one of them to all the other branches...

So... Any ideas, experiences, or suggestions that you wouldn't mind sharing?

UPDATE: I chose the solution described in my comment on the accepted answer. Thanks to everyone who contributed!

UPDATE 2: Even though it's not tightly related to this question, I recently stumbled upon this model of branching that appears to be suitable for pretty much any organized development cycle, with git as the underlying DVCS. It's a really good read. Recommended.

ralphtheninja · Accepted Answer

Alternate answer:

You could move the abstraction from branch level to repository level. Create one main repo with a master branch. Clone this repository for each site. When changes are made on the master branch, pull these changes into each site repo. This way you will only need one master branch per repo.
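
A minimal sketch of that repository-level layout, run in a throwaway directory. The directory names (main, site_pl) and file names are hypothetical, and a local path stands in for wherever the main repository would actually be hosted. The pull is written with --no-rebase so that the site clone merges the shared change instead of replaying its own commits:

```shell
#!/bin/sh
# Sketch only: all names below are made up for illustration.
set -e
work=$(mktemp -d)
cd "$work"
# Throwaway identity so commits work without global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# The main repository with a single master branch.
git init -q main
cd main
git symbolic-ref HEAD refs/heads/master
echo "shared" > index.html
git add index.html
git commit -q -m "Shared site code"
cd ..

# One clone per site; site-specific commits live only in that clone.
git clone -q main site_pl
cd site_pl
echo "polish banner" > banner.txt
git add banner.txt
git commit -q -m "Polish banner"
cd ..

# A universal change lands in the main repository...
cd main
echo "shared v2" > index.html
git commit -q -am "Update shared code"
cd ..

# ...and each site clone pulls it in.
cd site_pl
git pull -q --no-rebase --no-edit origin master
```

Site-specific commits never reach the main repository this way, so its history stays linear. Each site clone still records a merge commit whenever it pulls after diverging.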

Original answer:

When the master branch has changed, you could rebase the other branches onto the updated master branch. Let's assume you have pl_site based on some commit on master and that master has changed:

 o---o---o---o---o  master
         \
          o---o---o---o---o  pl_site

After you have rebased pl_site, it will look like:

 o---o---o---o---o  master
                  \
                   o'---o'---o'---o'---o'  pl_site

Note that the commits on the rebased version of pl_site are new. pl_site now contains the changes that were made on master.

Commands:

$ git checkout pl_site
$ git rebase master
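
To see the effect end to end, here is a self-contained sketch that runs the same two commands in a throwaway repository. The file names and commit messages are hypothetical. Because the rebased commits are new, anything that already cloned pl_site would have to be reset or force-updated afterwards, which is worth weighing before choosing this route:

```shell
#!/bin/sh
# Sketch only: file names and messages are made up for illustration.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
echo "base" > shared.txt
git add shared.txt
git commit -q -m "Base"

# A site branch with one site-specific commit.
git checkout -q -b pl_site
echo "pl" > pl.txt
git add pl.txt
git commit -q -m "Polish-only change"
before=$(git rev-parse pl_site)

# master moves on with a universal change.
git checkout -q master
echo "v2" > shared.txt
git commit -q -am "Universal change"

# Replay the site-specific commit on top of the updated master.
git checkout -q pl_site
git rebase -q master
after=$(git rev-parse pl_site)

# The history is now a straight line: master first, then pl_site on top.
git --no-pager log --oneline --graph master pl_site
```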

Dietrich Epp · Answer

I don't have a good answer for you, because your problem is complicated and has complicated solutions.

Option 1: Refactor

You said that the different sites are "basically the same site". So move them to different projects, and keep the main_site in a project by itself. The other sites will then include main_site as a subproject.

So, for the banner...

en_site/
en_site/images/banner.jpg
en_site/master/
en_site/master/images/banner.jpg

Your web site code, configuration script, deployment script, or whatever will make sure that images/banner.jpg is chosen over master/images/banner.jpg. Maybe when you deploy the site, master/images gets copied first and then images gets copied over it. Maybe you do something more sophisticated.
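
A rough sketch of the "copy master first, then copy the site over it" deployment, assuming the en_site layout shown above. The deploy directory, the logo.jpg file, and the file contents are hypothetical stand-ins:

```shell
#!/bin/sh
# Sketch only: builds a tiny fake en_site tree, then overlays it.
set -e
work=$(mktemp -d)
cd "$work"
mkdir -p en_site/images en_site/master/images
echo "shared banner"  > en_site/master/images/banner.jpg
echo "shared logo"    > en_site/master/images/logo.jpg
echo "english banner" > en_site/images/banner.jpg

deploy="$work/deploy"
mkdir -p "$deploy"

# Step 1: the shared files go in first.
cp -R en_site/master/. "$deploy/"

# Step 2: site-specific files are copied over them, skipping master/ itself.
for entry in en_site/*; do
    if [ "$(basename "$entry")" != "master" ]; then
        cp -R "$entry" "$deploy/"
    fi
done
```

After this runs, deploy/images/banner.jpg is the English banner, while deploy/images/logo.jpg, which the site does not override, still comes from master.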

This might be a lot of work. However, when you look at the history, you'll get something like this:

en_site: A -> B -> C -> D
de_site: E -> F -> G -> H -> I
main_site: J -> K -> L -> M -> N -> O

Option 2: Use Darcs

In Darcs, you can move patches from branch to branch. Some commercial VCSs can probably do this too. So your branches would look like this:

master: patch1 patch2 patch3 patch4
en_site: patch1 patch2 patch3 patch4 en1 en2 en3
de_site: patch1 patch2 patch3 patch4 de1 de2 de3

Suppose that you want to port patch en2 to the German site.

de_site: patch1 patch2 patch3 patch4 de1 de2 de3 en2

Voila. However, this is not as clean as it looks. Darcs aficionados will point out that this patch model matches our conceptual model of "moving a patch to another branch". However, this glosses over the fact that you'll still have to test to make sure that the en2 patch doesn't break everything when you put it on de_site.

For example, what if en2 makes a change to the same part of the code as de1? What then? You have to merge manually, no matter what VCS you are using. For every obvious case like this, there is another case which the VCS won't detect and you'll have to check it yourself.
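
For comparison, the nearest git equivalent of porting en2 is the cherry-pick the question already uses, and git cherry can list which commits on one branch have no equivalent on another, which helps with the "forgot to pick one" worry. A minimal sketch in a throwaway repository, with hypothetical file names; it only covers the conflict-free case, so the testing caveat above still applies:

```shell
#!/bin/sh
# Sketch only: branch contents are made up for illustration.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
echo "shared" > shared.txt
git add shared.txt
git commit -q -m "patch1"
git branch en_site
git branch de_site

# Two English-only commits.
git checkout -q en_site
echo "en1" > en1.txt
git add en1.txt
git commit -q -m "en1"
echo "en2" > en2.txt
git add en2.txt
git commit -q -m "en2"
en2=$(git rev-parse HEAD)

# One German-only commit.
git checkout -q de_site
echo "de1" > de1.txt
git add de1.txt
git commit -q -m "de1"

# Port en2 only; -x records the original commit id in the new message.
git cherry-pick -x "$en2" > /dev/null

# Compare en_site against de_site:
#   "-" marks a commit whose change already exists on de_site,
#   "+" marks a commit that has not been ported.
git cherry -v de_site en_site
```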

My experience

When I first started using git, it seemed like git merge was magic. However, no amount of VCS trickery is going to hide the fact that your site has some very complicated interdependencies. You can either refactor your site to remove the interdependencies, or hope that your VCS history doesn't become so complicated that you can no longer understand it.

Choosing between new branches, new projects, and refactoring things into libraries is a delicate tradeoff. Because maintaining a large collection of patchsets is much more work than maintaining a large collection of projects which all use a common library, the (large) amount of work necessary to refactor may pay off quickly. Or it may not.

Tags:

git

git-branch

repository-design

mr.b
