We develop .NET enterprise software in C# and are looking to improve our version control system. I have used Mercurial before and have been experimenting with it at our company. However, since we develop enterprise products, we have a big focus on reusable components or modules. I have been attempting to use Mercurial's sub-repos to manage components and dependencies but am having some difficulties. Here are the basic requirements for source control/dependency management:
Here is the structure in Mercurial that I have been using:
SHARED1_SLN-+-docs
            |
            +-libs----NLOG
            |
            +-misc----KEY
            |
            +-src-----SHARED1-+-proj1
            |                 +-proj2
            |
            +-tools---NANT

SHARED2_SLN-+-docs
            |
            +-libs--+-SHARED1-+-proj1
            |       |         +-proj2
            |       |
            |       +-NLOG
            |
            +-misc----KEY
            |
            +-src-----SHARED2-+-proj3
            |                 +-proj4
            |
            +-tools---NANT

PROD_SLN----+-docs
            |
            +-libs--+-SHARED1-+-proj1
            |       |         +-proj2
            |       |
            |       +-SHARED2-+-proj3
            |       |         +-proj4
            |       |
            |       +-NLOG
            |
            +-misc----KEY
            |
            +-src-----prod----+-proj5
            |                 +-proj6
            |
            +-tools---NANT
1. If Bob is working on PROD1 and Alice is working on SHARED1, how can Bob know when Alice commits changes to SHARED1? Currently with Mercurial, Bob is forced to manually pull and update within each subrepo. If he pushes to or pulls from the server in the PROD_SLN repo, he never learns about updates to the subrepos. This is described on the Mercurial wiki. How can Bob be notified of updates to subrepos when he pulls the latest PROD_SLN from the server? Ideally, he should be notified (preferably during the pull) and then manually decide which subrepos he wants to update.
2. Assume SHARED1 references NLog v1.0 (commit/rev abc in Mercurial) and SHARED2 references NLog v2.0 (commit/rev xyz in Mercurial). If Bob is absorbing these two components into PROD1, he should be made aware of this discrepancy. While technically Visual Studio/.NET would allow two assemblies to reference different versions of a dependency, my structure does not allow this because the path to NLog is fixed for all .NET projects that depend on NLog. How can Bob know that two of his dependencies have version conflicts?
3. If Bob is setting up the repository structure for PROD1 and wants to include SHARED2, how can he know which dependencies SHARED2 requires? With my structure, he would have to manually clone (or browse on the server) the SHARED2_SLN repo and either look in the libs folder or peek at the .hgsub file to determine which dependencies he needs to include. Ideally this would be automated: if I include SHARED2 in my product, SHARED1 and NLog are auto-magically included too, and I am notified if there is a version conflict with some other dependency (see question 2 above).
1. Is Mercurial the correct solution?
2. Is there a better Mercurial structure?
3. Is this a valid use for subrepos, given that the Mercurial developers describe subrepos as a feature of last resort?
4. Does it make sense to use Mercurial for dependency management? We could use yet another tool for dependency management (maybe an internal NuGet feed?). While this would work well for 3rd party dependencies, it would create a hassle for internally developed components: if they are actively developed, developers would have to constantly update the feed, we would have to serve them internally, and it would not allow components to be modified by a consuming project (Note 8 and Question 2).
5. Do you have a better solution for enterprise .NET software projects?
I have read several SO questions and found this one to be helpful, but the accepted answer suggests using a dedicated tool for dependencies. While I like the features of such a tool, it does not allow dependencies to be modified and committed from a consuming project (see Bigger Question 4).
This may not be the answer you were looking for, but we have recent experience of novice Mercurial users using sub-repos, and I've been looking for an opportunity to pass on our experience...
In summary, my advice based on experience is: however appealing Mercurial sub-repos may be, do not use them. Instead, find a way to lay out your directories side-by-side, and to adjust your builds to cope with that.
However appealing it seems to be to tie together revisions in the sub-repo with revisions in the parent repo, it just doesn't work in practice.
During all the preparation for the conversion, we received advice from multiple different sources that sub-repos were fragile and not well-implemented - but we went ahead anyway, as we wanted atomic commits between repo and sub-repo. The advice - or my understanding of it - talked more about the principles rather than the practical consequences.
It was only once we went live with Mercurial and a sub-repo, that I really understood the advice properly. Here (from memory) are examples of the sorts of problems we encountered.
All of these things are annoying enough in the hands of expert users - but when you are rolling out Mercurial to novice users, they are a real nightmare, and the source of much wasted time.
So, having put in a lot of time to get a conversion with a sub-repo, several weeks later we then converted the sub-repo to a repo. Because we had large amounts of history in the conversion that referred to the sub-repo, via .hgsubstate, it's left us with something much more complicated.
I only wish I'd really appreciated the practical consequences of all the advice much earlier on, e.g. in Mercurial's Features of Last Resort page:
But I need to have managed subprojects!
Again, don't be so sure. Significant projects like Mozilla that have tons of dependencies do just fine without using subrepos. Most smaller projects will almost certainly be better off without using subrepos.
Edit: Thoughts on shell repos
With the disclaimer I don't have any experience of them...
No, I don't think many of them are. You are still using sub-repos, so all the same user issues apply (unless you can provide a wrapper script for every step, of course, to remove the need for humans to supply the correct options to handle sub-repos.)
Also note that the wiki page you quoted does list some specific issues with shell repos:
- overly-strict tracking of relationship between project/ and somelib/
- impossible to check or push project/ if somelib/ source repo becomes unavailable
- lack of well-defined support for recursive diff, log, and status
- recursive nature of commit surprising
Edit 2 - do a trial, involving all your users
The point at which we really started realising we had an issue was once multiple users started making commits, and pulling and pushing - including changes to the sub-repo. For us, it was too late in the day to respond to these issues. If we'd known them sooner, we could have responded much more easily and simply.
So at this point, the best advice I think I can offer is to recommend that you do a trial run of the project layout before the layout is carved in stone.
We left the full-scale trial until too late to make changes, and even then people only made changes in the parent repo, and not the sub-repos - so we still didn't see the full picture until too late.
In other words, whatever layout you consider, create a repository structure in that layout, and get lots of people making edits. Try to put enough real code into the various repos/sub-repos so that people can make real edits, even though they will be throw-away ones.
Possible outcomes:
This command, when executed in the parent "shell" repo, will traverse all subrepos and list the changesets at the default pull location that are not yet present locally:
hg incoming --subrepos
The same thing can be accomplished by clicking on the "Incoming" button on the "Synchronize" pane in TortoiseHg if you have the "--subrepos" option checked (on the same pane).
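For scripted use (a scheduled check, say), the same command can be wrapped in a few lines of Python. This is only a sketch: the function names and the repository path argument are hypothetical, and it assumes `hg` is on the PATH. It relies on `hg incoming` exiting with status 0 when changesets are incoming and 1 when there are none.

```python
import subprocess


def build_incoming_command(include_subrepos=True):
    """Return the hg command line used to list incoming changesets."""
    cmd = ["hg", "incoming"]
    if include_subrepos:
        cmd.append("--subrepos")
    return cmd


def has_incoming(returncode):
    """Interpret hg's exit status: 0 = changesets incoming, 1 = none."""
    if returncode == 0:
        return True
    if returncode == 1:
        return False
    raise RuntimeError(f"hg incoming failed with exit status {returncode}")


def check_incoming(repo_path):
    """Run the command inside repo_path (hypothetical path to the shell repo).

    Returns a (has_incoming, output) pair; the output is hg's own listing.
    """
    result = subprocess.run(
        build_incoming_command(),
        cwd=repo_path,
        capture_output=True,
        text=True,
    )
    return has_incoming(result.returncode), result.stdout
```

Calling `check_incoming("PROD_SLN")` from the directory that holds the clone would then report whether anything is waiting, leaving the decision about which subrepos to update to the developer.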
Thanks to the users in the mercurial IRC channel for helping here.
First I need to modify my repo structures so that the parent repos are truly "shell" repos as recommended on the hg wiki. I will take this to the extreme and say that the shell should contain no content, only subrepos as children. In summary, rename src to main, move docs into the subrepo under main, and change the prod folder to a subrepo.
SHARED1_SLN-+-libs----NLOG
            |
            +-misc----KEY
            |
            +-main----SHARED1-+-docs
            |                 +-proj1
            |                 +-proj2
            |
            +-tools---NANT

SHARED2_SLN-+-libs--+-SHARED1-+-docs
            |       |         +-proj1
            |       |         +-proj2
            |       |
            |       +-NLOG
            |
            +-misc----KEY
            |
            +-main----SHARED2-+-docs
            |                 +-proj3
            |                 +-proj4
            |
            +-tools---NANT

PROD_SLN----+-libs--+-SHARED1-+-docs
            |       |         +-proj1
            |       |         +-proj2
            |       |
            |       +-SHARED2-+-docs
            |       |         +-proj3
            |       |         +-proj4
            |       |
            |       +-NLOG
            |
            +-misc----KEY
            |
            +-main----PROD----+-docs
            |                 +-proj5
            |                 +-proj6
            |
            +-tools---NANT
In regards to point 3 above, if the dependency file uses a format similar to .hgsub but with the addition of the rev/changeset/tag, then getting the dependencies could be automated. For example, say I want SHARED1 in my new product. I clone SHARED1 to my libs folder and update to the tip or the last release label. Then I need to look at the dependencies file and a) clone each dependency to the correct location and b) update it to the specified rev/changeset/tag. It is very feasible to automate this. To take it further, the tool could even track the rev/changeset/tag and alert the developer if there is a dependency conflict between shared libs.
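The automation described above could be sketched roughly as follows. Everything about the file format here is an assumption made for illustration: each line reads `path = source @ rev`, which is the `.hgsub` layout (`path = source`) with an invented `@ rev` suffix, and the example.com URLs are placeholders. The script only builds the `hg clone -u REV SOURCE DEST` command lines; it does not run them.

```python
from typing import NamedTuple


class Dependency(NamedTuple):
    path: str    # where the dependency is cloned, e.g. libs/NLOG
    source: str  # repository to clone from
    rev: str     # rev/changeset/tag to update to


def parse_dependencies(text):
    """Parse lines of the hypothetical form 'path = source @ rev'."""
    deps = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        path, _, rest = line.partition("=")
        # Split on the last '@' so a user@host source still parses.
        source, _, rev = rest.rpartition("@")
        if not path.strip() or not source.strip() or not rev.strip():
            raise ValueError(f"malformed dependency line: {raw!r}")
        deps.append(Dependency(path.strip(), source.strip(), rev.strip()))
    return deps


def clone_commands(deps):
    """Build one 'hg clone -u REV SOURCE DEST' command per dependency."""
    return [["hg", "clone", "-u", d.rev, d.source, d.path] for d in deps]


if __name__ == "__main__":
    sample = """
    # hypothetical dependency file for SHARED2
    libs/SHARED1 = https://hg.example.com/SHARED1 @ release-1.2
    libs/NLOG = https://hg.example.com/NLOG @ abc
    """
    for cmd in clone_commands(parse_dependencies(sample)):
        print(" ".join(cmd))
```

Keeping command construction separate from execution makes it possible to review what would be cloned before touching the working directory.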
A hole remains if Alice is actively developing SHARED1 while Bob is developing PROD. If Alice updates SHARED1_SLN to use NLog v3.0, Bob may not ever know this. If Alice updates her dependency file to reflect the change then Bob does have the info, he just has to be made aware of the change.
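Once each component records the revisions it depends on, making Bob aware of a mismatch comes down to comparing those records. A minimal sketch, assuming the pinned revisions have already been read into a plain dictionary (the data layout and function name are hypothetical; the revisions `abc` and `xyz` are the ones from question 2):

```python
from collections import defaultdict


def find_conflicts(pins):
    """Report dependencies pinned to different revisions.

    pins maps a component name to {dependency name: pinned rev}.
    Returns {dependency: {rev: [components pinning that rev]}} for
    every dependency that is pinned to more than one revision.
    """
    seen = defaultdict(dict)
    for component, deps in pins.items():
        for dep, rev in deps.items():
            seen[dep].setdefault(rev, []).append(component)
    return {dep: revs for dep, revs in seen.items() if len(revs) > 1}


if __name__ == "__main__":
    pins = {
        "SHARED1": {"NLOG": "abc"},
        "SHARED2": {"NLOG": "xyz", "SHARED1": "release-1.2"},
    }
    for dep, revs in find_conflicts(pins).items():
        print(f"conflict on {dep}: {revs}")
```

Running the same check after every pull would also surface the case where Alice moves SHARED1 to a newer NLog, provided she updates her dependency file.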
I believe that this is a source control issue and not something that can be solved with a dependency management tool, since such tools generally work with binaries and only fetch dependencies (they don't allow committing changes back to the dependencies). My dependency problems are not unique to Mercurial. From my experience, all source control tools have the same problem. One solution in SVN would be to just use svn:externals (or svn copies) and recursively have every component include its dependencies, creating a possibly huge tree to build a product. However, this falls apart in Visual Studio, where I really only want to include one instance of a shared project and reference it everywhere. As implied by @Clare's answer and Greg's response to my email to the hg mailing list, keep components as flat as possible.
There is a better structure, as I have laid out above. I believe we have a strong use case for subrepos and I do not see a viable alternative. As mentioned in @Clare's answer, there is a camp that believes dependencies can be managed without subrepos. However, I have yet to see any evidence or actual references to back this statement up.
Still open to better ideas...