
How do I manage an open-source and commercial version of the same project using source control?

We are developing an open source project and use Mercurial for source control. The project's Mercurial repository is public (hosted on Bitbucket).

Now we have a client for whom we need to customize our open source software. These customizations must be kept private, so we probably need to create a new Hg repository for this client; this new repository would be private.

The problem is that, from time to time, we would need to merge changes (such as new features or bug fixes) from the open repository into our private repository.

What is the best way to achieve this? I read that it is possible to merge two or more Mercurial repositories, but that the history will be lost. Merging could also be painful because of many conflicts. If we get a few more clients in the future, how should we manage their repositories? Should we use one repository and multiple branches? What if the two project versions start to head in different directions and the two repositories become increasingly different?

Please share your experience about this.

Thanks in advance!

Ivica asked Oct 07 '11 10:10





2 Answers

What you describe is a standard thing with a distributed version control system: developing in two repositories and keeping one a subset of the other. Start by making a clone for the private development:

hg clone open private

Then go into private and make the new features there. Commit as normal. The private repository will now contain more changesets than the open repository -- namely the new features.

When bugfixes and new features are put into the open repository as part of the normal open source process, then you pull them into the private repository:

cd private
hg pull
hg merge
hg commit -m "Merge with open"

That way you keep the invariant: the private repository always contains everything in the open version, plus the private enhancements. If you're working on the private version and discover a bug, remember to check whether the bug exists in the open version too. If so, fix it in the open version first and merge the bugfix into the private version. If you fix a bug in the private version by mistake, use hg transplant (a bundled extension that must be enabled first) to copy the bugfix over to the open version, for example by running hg transplant -s ../private REV inside the open clone, where REV is the bugfix changeset.

There won't be any loss of history. You will have to resolve the merge as usual when you run hg merge, and the conflicts will only be as large as your private changes require.

The important thing to remember is to never push (or pull) the other way, unless you want to begin releasing some of the private changes into the open source version.

You can use this setup several times with different clients and you can also push/pull changesets between different private repositories as needed if several clients require the same private enhancement.
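With several client repositories, the pull/merge/commit routine above could be scripted so that every private clone gets the same treatment. The following is only a rough sketch: the function names, the commit message, and the repository paths are assumptions of mine, and it relies on hg exiting with a non-zero status when a step needs attention (for example when hg merge finds nothing to merge or leaves conflicts unresolved).

```python
import subprocess


def sync_commands(open_repo):
    """Return the hg commands that bring a private clone up to date.

    open_repo is the path or URL of the open repository (an assumption
    here; a plain "hg pull" would use the clone's default path instead).
    """
    return [
        ["hg", "pull", open_repo],
        ["hg", "merge"],
        ["hg", "commit", "-m", "Merge with open repository"],
    ]


def sync(private_dir, open_repo):
    """Run the commands inside private_dir.

    Returns None if every step succeeded, otherwise the first command
    that exited with a non-zero status, so a person can take a look.
    """
    for cmd in sync_commands(open_repo):
        if subprocess.run(cmd, cwd=private_dir).returncode != 0:
            return cmd
    return None
```

You would then call something like sync("clients/acme", "../open") for each client clone (both paths are hypothetical) and inspect any repository for which a command is returned.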

Martin Geisler answered Nov 10 '22 02:11



In principle the basic model is relatively simple: have a separate private repository that is a clone (branch) of the public one, make all private changes there, and regularly merge the public one into the private one. History is preserved; I don’t know where you read that it would be lost.

However the challenge is to not end up with an unmaintainable merge hell, and this can only be achieved through strict discipline.

The most basic rules of thumb for any long-lived branches are:

  1. Keep the private branch as small as possible. Minimise the number of changes in there and keep each one small: don’t refactor huge parts of the code or change indentation. In a one-way merge situation like this one, any code you modify has the potential to conflict, even far down the line.

  2. Merge frequently, the more often the better. If you don’t, then every time you do want to integrate the changes from the public repository you will end up with one super-merge that has a ton of conflicts.

Additionally, you should be disciplined in organising and writing your code to facilitate this scenario. Have clear rules about what goes where and on which branch, and section off the pieces of code accordingly.

Ideally you would model the customised functionality as a plug-in or external library, or even a separate project. That may not always be easily achievable; in that case, at least try to write all private modifications as sub-classes of the original classes, which you instantiate through factory methods. By making all your changes in independent files that only exist on the private branch, you minimise the risk of conflicts.
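To illustrate the sub-class plus factory idea, here is a minimal sketch in Python. The language, the class names, and the file split shown in the comments are all hypothetical, since the question does not say what the project looks like. The point is that the open code only ever asks the factory for an object, and the private customisation lives entirely in a file that exists only in the private repository.

```python
# --- open repository, e.g. core.py (hypothetical) ---

class ReportRenderer:
    """Default behaviour shipped with the open source version."""

    def header(self):
        return "Report"

    def render(self, rows):
        return "\n".join([self.header()] + [str(row) for row in rows])


_renderer_class = ReportRenderer


def register_renderer(cls):
    """Let a plug-in replace the class that the factory instantiates."""
    global _renderer_class
    _renderer_class = cls
    return cls


def make_renderer():
    """Factory method: callers never name a concrete class directly."""
    return _renderer_class()


# --- private repository only, e.g. client_acme.py (hypothetical) ---

@register_renderer
class AcmeReportRenderer(ReportRenderer):
    """Client-specific override; no line of the open code is edited."""

    def header(self):
        return "ACME Corp confidential report"


print(make_renderer().render([1, 2]))
```

Because the private file only adds code and the open file is left untouched, a merge from the public repository has nothing in it to conflict with.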

Also write automated tests. Lots of them. Otherwise you won’t promptly detect merge problems (which will happen), and the private branch will often be broken.

Finally a tip: make a push hook on the public repository that denies any push containing a changeset that you know is private; this will prevent accidental publication of the private code and potentially save you a lot of headaches.
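A rough sketch of such a hook, written as an external script, might look like the following. It assumes the public repository lives on a server where you can install hooks, that the script is wired up with an [hooks] entry such as pretxnchangegroup.deny_private = python3 /path/to/deny_private.py (the hook name and path are made up), and that you maintain the list of private changeset hashes yourself; the hash shown is a placeholder. Listing the first private changeset of each client repository should be enough, because any push of later private work has to include it.

```python
import os
import subprocess
import sys

# Hypothetical blacklist: full 40-character hashes of changesets known to be
# private, for example the first changeset committed in each client repository.
PRIVATE_CHANGESETS = {
    "0123456789abcdef0123456789abcdef01234567",  # placeholder, not a real hash
}


def find_private(incoming, private=PRIVATE_CHANGESETS):
    """Return the incoming changeset hashes that are on the private list."""
    return [node for node in incoming if node in private]


def incoming_changesets(first_node):
    """List every changeset from first_node up to tip, one full hash each."""
    result = subprocess.run(
        ["hg", "log", "-r", first_node + ":tip", "--template", "{node}\n"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.split()


def main():
    # Mercurial passes the first incoming changeset to the hook as HG_NODE.
    leaked = find_private(incoming_changesets(os.environ["HG_NODE"]))
    for node in leaked:
        sys.stderr.write("push rejected: changeset %s is private\n" % node)
    return 1 if leaked else 0  # a non-zero exit status aborts the transaction


# Only run the check when Mercurial invokes the script as a hook.
if __name__ == "__main__" and "HG_NODE" in os.environ:
    sys.exit(main())
```

Because the hook runs before the transaction is committed, a rejected push leaves the public repository unchanged.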

Laurens Holst answered Nov 10 '22 02:11
