
Working with Git in a web project for multiple customers

Is there a better way to use Git to version-control web projects that receive small, scattered updates across several customer projects?

I want to use Git to version-control web projects. The main difference from almost all other proposals is that this is a web project consisting of HTML, JavaScript and some PHP files; there are no central libraries used by one or more programs, as is usual in typical Linux packages.

All my web projects are for different customers and are based on the same platform files. I estimate that 80% of the files are identical (call them the platform) and 20% are modified to fit each customer's needs. The problem is that I don't know in advance which files will need a customer-specific version; in detail, every customer is different.

The best solution would be to keep the platform files in one directory and overlay them with customer-specific files from another directory. I have found nothing really good for doing this with Git so far:

  • git submodule (as proposed here) is typically designed to keep the sources of a vendor-developed library close to the program that links against it. The problem is that the platform and customer files end up in different directories, so I have to mix them during deployment to create the files for the web server. I also have to keep the directory trees in sync manually, which would be a hell of a lot of work with hierarchies 10 directories deep. In general, many postings grumble about the administrative effort of using submodules; it looks like overkill.
  • git subtree (as proposed here) seems simpler than submodules but suffers from the same problem with separate directories, so I still need to keep the directory structure in sync and mix the files during deployment. It is also difficult to push platform changes back from a customer repo.
  • GitSlave (as proposed here): I'm not sure whether this would benefit me. It allows keeping several Git repos in sync, so maybe it helps with syncing the directory structure of the platform, but I doubt it.
  • Refactoring to separate platform and customer files into different directories (the result of this discussion): I think this is simply impossible given my customers and the technology used in web projects. One customer needs an update to this page, another to that page. Even after introducing a PHP framework, the customer-specific changes would be spread over the whole tree.
  • Checkouts (also proposed in the last posting of this discussion): this looks very simple and promising, with the drawback that all customer-specific files are outside of Git (so outside of version control). Furthermore, if a file is updated both in the platform and for the customer, git pull aborts, so this is not usable.
  • Vendor branches (as recommended here): as I have learned, branches are made to be merged back, and that is not the intent for my customer-specific patches. These branches would always stay open and would only receive merges from the platform (main) towards the customer after a platform update. This would also lead to a monolithic repo holding all customers and the platform, which is not the Git way of handling repos.
  • Mix during deployment: a very pragmatic method that keeps the platform files in one repo and each customer's files in a dedicated repo. The deployment first writes all platform files to the web server and then overwrites some of them with the customer-specific files. The mixing happens very late, in the web server's directory. This also has the drawback that the directory structure of each customer has to be kept in sync with the platform structure manually; otherwise the deployment would become too complex.
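
To make the "mix during deployment" option concrete, here is a minimal sketch, assuming both repos are already checked out next to each other. Python is used only for illustration; the function names and the directory names (platform, customer_a, webroot) are hypothetical. The second helper is an assumption about how the "keep the trees in sync" chore could be checked automatically: it lists customer files that have no counterpart in the platform tree.

```python
import shutil
import tempfile
from pathlib import Path


def deploy_overlay(platform_dir: Path, customer_dir: Path, target_dir: Path) -> None:
    """Copy the platform tree, then overwrite it with the customer tree."""
    # dirs_exist_ok (Python 3.8+) lets the second copy merge into the first.
    shutil.copytree(platform_dir, target_dir, dirs_exist_ok=True)
    shutil.copytree(customer_dir, target_dir, dirs_exist_ok=True)


def stray_customer_files(platform_dir: Path, customer_dir: Path) -> list[str]:
    """List customer files that do not shadow any platform file.

    These are either genuine customer-only additions or a hint that the
    two directory structures have drifted apart.
    """
    return sorted(
        path.relative_to(customer_dir).as_posix()
        for path in customer_dir.rglob("*")
        if path.is_file()
        and not (platform_dir / path.relative_to(customer_dir)).exists()
    )


if __name__ == "__main__":
    # Throwaway demo trees; real paths would point at the two checkouts.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        platform = root / "platform"
        customer = root / "customer_a"
        webroot = root / "webroot"
        (platform / "pages").mkdir(parents=True)
        (customer / "pages").mkdir(parents=True)
        (platform / "pages" / "index.php").write_text("platform index")
        (platform / "pages" / "about.php").write_text("platform about")
        (customer / "pages" / "about.php").write_text("customer about")
        (customer / "pages" / "abuot.php").write_text("misnamed override")

        deploy_overlay(platform, customer, webroot)
        print((webroot / "pages" / "about.php").read_text())
        print(stray_customer_files(platform, customer))
```

Running the stray-file check before each deployment would not remove the need to mirror the platform structure, but it would at least flag overrides that silently stopped matching after a platform file was moved or renamed.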

What is the best approach here?

Achim asked Aug 11 '12 14:08


2 Answers

TL;DR

This is actually an architectural design problem, not a source code management problem. Nevertheless, it's a common and interesting problem, so I'm offering some general advice on how to address your architectural issues.

Not Really a Git Problem

The problem isn't really Git here. The issue is that you haven't adequately differentiated what remains the same vs. what will change between customers. Once you've determined the correct design pattern, the appropriate source control model will become more obvious.

Consider this quote from Russ Olsen:

[Separate] the things that are likely to change from the things that are likely to stay the same. If you can identify which aspects of your system design are likely to change, you can isolate those bits from the more stable parts.

Olsen, Russ (2007-12-10). Design Patterns in Ruby (Kindle Locations 586-588). Pearson Education (USA). Kindle Edition.

Some Refactoring Suggestions

I don't know your application well enough to offer concrete advice, but in general web projects can benefit from a couple of different design patterns. The template, composite, or prototype patterns might all be applicable, but sometimes discussing patterns confuses the issue more than it helps.

In no particular order, here's what I would personally do:

  1. At the view layer, rely heavily on templates. Make heavy use of layouts, includes, or partials, so that you can more easily compose presentation-layer objects.
  2. Make heavy use of customer-specific configuration files (I rather like YAML for this purpose) to allow easier customization without modifying core code.
  3. At the model and controller layers, choose some appropriate structural patterns to allow your objects to behave polymorphically based on your customer-specific configuration files. Duck-typing is your friend here!
  4. Use some introspection based on hostname or domain, enabling polymorphic behavior for each client.
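
As a rough illustration of points 1, 2 and 4, the sketch below selects a customer by hostname, merges that customer's overrides onto platform defaults, and resolves a template by preferring a customer-specific file and falling back to the platform one. Python is used only for illustration; the same lookup is a few lines in PHP. Every name here (PLATFORM_DEFAULTS, CUSTOMER_OVERRIDES, the platform/ and customers/ template directories, the example hostnames) is hypothetical, and the dictionaries stand in for what could be YAML files.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Hypothetical platform defaults; in practice these could live in a YAML file.
PLATFORM_DEFAULTS = {"theme": "default", "show_newsletter": True, "items_per_page": 20}

# Hypothetical per-customer overrides, keyed by the hostname serving the site.
CUSTOMER_OVERRIDES = {
    "shop.customer-a.example": {"customer": "customer_a", "theme": "blue"},
    "www.customer-b.example": {"customer": "customer_b", "items_per_page": 50},
}


def config_for_host(hostname: str) -> dict:
    """Merge the platform defaults with the overrides for one hostname."""
    config = dict(PLATFORM_DEFAULTS)
    config.update(CUSTOMER_OVERRIDES.get(hostname.lower(), {}))
    return config


def resolve_template(name: str, template_root: Path, customer: Optional[str]) -> Path:
    """Prefer a customer-specific template, else fall back to the platform one."""
    if customer:
        candidate = template_root / "customers" / customer / name
        if candidate.is_file():
            return candidate
    return template_root / "platform" / name


if __name__ == "__main__":
    config = config_for_host("shop.customer-a.example")
    print(config)
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "platform").mkdir()
        (root / "customers" / "customer_a").mkdir(parents=True)
        (root / "platform" / "header.html").write_text("platform header")
        (root / "platform" / "footer.html").write_text("platform footer")
        (root / "customers" / "customer_a" / "header.html").write_text("custom header")
        # header.html is overridden for customer_a, footer.html is not.
        for name in ("header.html", "footer.html"):
            print(resolve_template(name, root, config.get("customer")).read_text())
```

With this shape, a customer who needs one page changed only adds one file under their own directory, and everything they have not overridden keeps following the platform.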

Next Steps with Git

Once you've refactored your application to minimize the changes between customers, you may find you don't even need to keep your code separate at all unless you're trying to hide polymorphic code from each client. If such is the case, you can certainly investigate submodules or separate branches at that point, but without the burden of heavy duplication between branches.

Symlinks are Your Friends, Too

Lastly, if you find that you can isolate changes into a few subdirectories, remember that Git can track symlinks. You could simply keep all your varied code in a per-client subdirectory on your development branch, and symlink the files into the right places on your per-client release branches. You can even automate this with some shell scripts or during automated deployments.

This keeps all your development code in one place for easy comparisons and refactoring (e.g. the development branch), but ensures that code that really does need to be different for each release is where it needs to be when you roll it out into production.
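
The automation mentioned above could look roughly like the following sketch. It assumes a hypothetical layout in which each client's variants live under clients/<client>/ and mirror the paths they replace; the function name and layout are illustrative, not part of Git. Note that creating symlinks on Windows may require extra privileges.

```python
import os
import tempfile
from pathlib import Path


def link_client_files(repo_root: Path, client: str) -> list[str]:
    """Symlink every file under clients/<client>/ to the same relative
    path at the repository root, replacing any file already there."""
    client_dir = repo_root / "clients" / client
    linked = []
    for source in sorted(client_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(client_dir)
        target = repo_root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        if target.is_symlink() or target.exists():
            target.unlink()
        # A relative link keeps working when the checkout is moved or cloned.
        target.symlink_to(os.path.relpath(source, target.parent))
        linked.append(relative.as_posix())
    return linked


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        (repo / "pages").mkdir()
        (repo / "pages" / "about.php").write_text("platform about")
        (repo / "clients" / "customer_a" / "pages").mkdir(parents=True)
        (repo / "clients" / "customer_a" / "pages" / "about.php").write_text("customer about")

        print(link_client_files(repo, "customer_a"))
        print((repo / "pages" / "about.php").read_text())
```

On a per-client release branch, the resulting links would then be committed like any other change, so the branch differs from development only by the links themselves.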

Todd A. Jacobs answered Nov 02 '22 17:11


Of the Git-based options, vendor branches make the most sense given how you customize your solution for each customer. The best way to go about it, however, is to forgo this entirely and develop a multi-tenant application.

Adam Dymitruk answered Nov 02 '22 17:11