How to properly store common code using CVS or SVN with a team

Question

For my day job, we have a CVS repository where we are supposed to store our code. The problem is, it's such a mess that no one wants to use it because over the years people have put things in there in so many different ways.

I am trying to convince my boss to let us start from scratch and do it right. Here's my question:

How do you store projects so that they are easy for a new person to come along and pull down the code, yet have a place for common code?

For example, should we be storing our code like this:

/repo
  /project1
  /project2
  /common (contains db connection classes, etc)

Now I want to add a new project. Should I create my new project, copy what's in the common directory into it, and upload the whole thing as /project3, so that a new guy who comes along can just check out project3 and have everything? Or should I create my new project with most of it stored under /project3 in CVS and my common stuff stored under /common? The problem with the second option is that a new guy then has to spend days figuring out where all the code for project 3 is located in the repo.

David W. · Accepted Answer

There are several ways to handle common code. One is to copy it around your repository. This is technically known as The Stinky Bad Way. The reason is quite simple: if you change the common module in one place but not in the others, then after a while you don't have a common module any more.

In Subversion, you can use svn:externals to automatically import common code across directories. This is technically called Depending upon a proprietary mechanism to manage code that doesn't work all that well. I tried using svn:externals for years, and never got it working the way I wanted. The problem is that when I tag my code or create a branch, my svn:externals links don't automatically move over.

For example, imagine I depend upon a common project stored in http://repos/svn/common. Because there are changes in common that are required in my project, we decide to create a 2.1 branch in common at http://repos/svn/common/branches/2.1, and my svn:externals will point there. After I finish my changes, I first have to create a http://repos/svn/common/tags/2.0 tag in common, then I have to change my svn:externals to point to this new URL, and only then can I create the tag in my project. And, if I depend upon dozens of common projects, I'll have dozens of these externals to track.
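The bookkeeping described above can be sketched as a list of Subversion commands. This is only an illustration under assumptions: the project URL `http://repos/svn/project3`, a `trunk` directory in the project, the local directory name `common`, and the helper `release_steps` are all hypothetical, and the sketch assumes the branch and the tag in common share one version number. The externals value uses the URL-first syntax accepted by Subversion 1.5 and later.

```python
def release_steps(common_url, project_url, common_version, project_version,
                  local_dir="common"):
    """Return the svn commands needed to tag a project that pins one external.

    Nothing is executed; the commands are only assembled for inspection.
    """
    common_tag = f"{common_url}/tags/{common_version}"
    project_tag = f"{project_url}/tags/{project_version}"
    return [
        # 1. Tag the common code from its development branch.
        ["svn", "copy", f"{common_url}/branches/{common_version}", common_tag,
         "-m", f"Tag common {common_version}"],
        # 2. Repoint the external at the new tag (run inside a working copy).
        ["svn", "propset", "svn:externals", f"{common_tag} {local_dir}", "."],
        ["svn", "commit", "-m", f"Pin common to {common_version}"],
        # 3. Only now tag the project itself (assumes a trunk directory).
        ["svn", "copy", f"{project_url}/trunk", project_tag,
         "-m", f"Tag project {project_version}"],
    ]


if __name__ == "__main__":
    for cmd in release_steps("http://repos/svn/common",
                             "http://repos/svn/project3", "2.1", "1.0"):
        print(" ".join(cmd))
```

Every extra common project adds its own copy of steps 1 and 2, which is where the tracking burden comes from.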

The best way is to treat your common dependencies as pre-compiled third party libraries. If you use Java, they'll become .jar files. If you use C++, they'll become *.so or *.dll. You then store these compiled objects in a release repository, and during the build process, you can fetch the right version of these dependencies in each project.

The good news is that there's already an open source, reliable technology that does this, so you don't have to invent anything. The bad news is that it's Maven.

However, even if you aren't a Java shop, or you use Ant instead of Maven, you can still use the same mechanism that Maven uses to pull in your common pre-compiled dependencies.

You need to use a Maven release repository software package like Nexus or Artifactory. If you aren't a Java shop, you don't need to connect these repositories to the outside world. Simply use them to store your releases.

During the build process, you download the dependencies with standard tools like wget or curl, or with Ivy if you're using Ant. If you're using Maven, Maven handles this automatically.
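As a rough sketch of the download step for a non-Maven build, the snippet below builds the URL of one artifact using the standard Maven repository layout (group path, artifact ID, version, then `artifactId-version.extension`) and fetches it. The repository base URL `http://repos/nexus/releases`, the coordinates `com.example:common:2.1`, and both function names are assumptions made up for illustration.

```python
import urllib.request


def artifact_url(repo_base, group_id, artifact_id, version, extension="jar"):
    """Build a download URL using the standard Maven repository layout."""
    group_path = group_id.replace(".", "/")
    filename = f"{artifact_id}-{version}.{extension}"
    return f"{repo_base.rstrip('/')}/{group_path}/{artifact_id}/{version}/{filename}"


def fetch_artifact(url, destination):
    """Download one artifact to a local file (the job wget or curl would do)."""
    urllib.request.urlretrieve(url, destination)


if __name__ == "__main__":
    # Hypothetical repository and coordinates; only the URL is printed here.
    print(artifact_url("http://repos/nexus/releases", "com.example", "common", "2.1"))
```

Because the version is part of the URL, each project states exactly which release of the common code it builds against.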

To upload the artifacts during the build, you can use the deploy:deploy-file goal of the Maven Deploy Plugin.
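A minimal sketch of how a build script might assemble that upload command is shown here. The `-Dfile`, `-Durl`, `-DrepositoryId`, `-DgroupId`, `-DartifactId`, `-Dversion` and `-Dpackaging` properties belong to the deploy:deploy-file goal; the file path, repository URL, repository ID and coordinates are hypothetical placeholders, as is the helper name.

```python
def deploy_file_command(file_path, repo_url, repository_id,
                        group_id, artifact_id, version, packaging="jar"):
    """Assemble a `mvn deploy:deploy-file` invocation as an argument list."""
    return [
        "mvn", "deploy:deploy-file",
        f"-Dfile={file_path}",
        f"-Durl={repo_url}",
        # repositoryId must match a <server> entry in settings.xml for credentials.
        f"-DrepositoryId={repository_id}",
        f"-DgroupId={group_id}",
        f"-DartifactId={artifact_id}",
        f"-Dversion={version}",
        f"-Dpackaging={packaging}",
    ]


if __name__ == "__main__":
    # Hypothetical values; the command is printed rather than executed.
    cmd = deploy_file_command("build/common-2.1.jar",
                              "http://repos/nexus/releases", "releases",
                              "com.example", "common", "2.1")
    print(" ".join(cmd))
```

The resulting list could be handed to `subprocess.run(cmd, check=True)` at the end of a build once the artifact has been produced.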

This last way is the trickiest to set up, but is well worth the effort. You now know your dependencies, and the version of each dependency. You also have everything stored only once in your source repository, since you're not copying source all around. And compiled code shouldn't be stored in your source repository anyway.

Reply to Rob Napier

+1 for excellent discussion, but a warning: using pre-built object files is very hard when all the pieces are under development at the same time. It works well when those .jar/.so/.dll files are fairly stable and static, and somewhat when they have dedicated teams maintaining them. But if you're developing all the parts together, and your team does not have a strong commitment to reuse, my experience is that The Stinky Bad Way is what still works best for the code that changes a lot. Ease into better reuse with the pieces that very seldom change, and then expand reuse as you learn and mature. – Rob Napier

The Stinky Bad Way (SBW) is the easiest way to do components. This is especially true if you're creating your component code while you're creating the programs that use it. The problem is that writing the initial program is only 10% of programming. The other 90% is maintaining that program and making sure it remains relevant. That's the hardest part of programming.

Imagine I decide to use Amazon's S3 service for storage, and I write what you could call an API or maybe a driver to sit between my programs and S3. Let's call it The Foundation Package, and say that all of my programs will use it.

The simplest thing to do is the SBW -- just copy the Foundation code to each module. If there's a problem or a module needs a new feature that's not in Foundation, I can just modify the Foundation until it does what I want.

This works out great for a few years. Then, Amazon announces a new API and that the old API will be deprecated. Not only that, the new API has features your customers want. Now, you have a problem.

You have this problem because this Foundation has no real owner. The separate development teams never really took the time to learn the code. If there was a change, they employed the HAIUIW ¹ method of development. Now, you have a half dozen separate and incompatible Foundation modules that no one really understands.

The SBW isn't an issue if your developers understand from the very start that the code they're using isn't a common module, but is part of their code. They'll learn how it works and use it as such. But, you're not going to get the benefits of having a common module.

In programming, 10% involves writing that first bit of code. The other 90% is attempting to maintain that code. We learned long ago that we should try to find errors as early as possible when we code. In many programming paradigms, we learn to write tests and documentation first, then code. It's hard and it's not fun. It makes doing that first 10% really, really hard. I could write the code in a couple of days, but now I'm spending a couple of weeks doing all of this thinking.

Yet, we know that doing this makes the other 90% of our programming job much, much easier. The same is true with components. It is so easy to copy the code from one place to another and HAIUIW. It is much more difficult to create a separate component with its own team. That team must work with the other development teams. There will be arguments, conflicts, and shouting matches. People will call each other names. Each group has its own goals. Now, imagine doing this while attempting to set up the release repository and create the whole infrastructure to get everything working. That first 10% is really difficult to do.

But, when Amazon makes that announcement for their new API, or you find that you could greatly increase sales if you could get your software working with Microsoft Azure too, making that separate component makes doing that other 90% much, much easier.

¹ Hack At It Until It Works.

Rob Napier · Answer

First, I strongly discourage anything that includes "let us start from scratch and do it right." In the vast majority of cases, it is better to pick a new direction and migrate towards it over time. This requires some commitment from the team, but all code-reuse requires commitment from the team.

Whether you should clone common classes or not depends very heavily on your team again. Sharing common code in a common directory means that anytime you change the common code, you potentially have to fix every project. This often puts a very high barrier on improving the common code, unless you have a team that is highly committed to managing common (typically this means having a separate group who does only this).

If you are like many teams, you will find that this becomes a mess in practice if your number of projects is large and your commitment to code reuse is weak to moderate (this being the common case).

This leads you towards cloning the common code for each project, which will quickly fork the common code, making it not "common" at all. So what's a team to do?

The first step is total honesty with yourself and your team. Is your team actually committed to code reuse? It can't be one or two guys. It has to be the team, including the manager. No process will change your culture. You cannot just say "we'll reuse code because we have to" and expect it to happen. Nothing happens that way.

Instead, I recommend taking easy gains as you can get them. Identify what's common, and within that what's really stable and seldom changes. That's the stuff to put into common. For SVN, use svn:externals for this so that a single checkout gets everything (svn:externals is a royal pain if the code changes a lot, but we agreed above that this is for code that seldom changes). For CVS, switch to SVN. :D You may be forced in CVS to create a "checkout-dependencies" script at the top of the project directory. I hate those, but if they're consistent, then it's not the worst possible solution. Really stable code can be checked in as libraries to speed up build times.
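For the CVS case, a "checkout-dependencies" script might look roughly like the sketch below. It only assembles `cvs checkout` commands from a small manifest; the CVSROOT string, the module name, the tag `COMMON_2_1` and the manifest layout are all hypothetical, so treat it as a starting point rather than a finished tool.

```python
# Hypothetical manifest: (module in the repository, tag to pin, local directory).
DEPENDENCIES = [
    ("common", "COMMON_2_1", "common"),
]


def checkout_commands(cvsroot, dependencies):
    """Build one `cvs checkout` command per pinned dependency."""
    commands = []
    for module, tag, local_dir in dependencies:
        commands.append([
            "cvs", "-d", cvsroot,      # global option: which repository to use
            "checkout",
            "-r", tag,                 # pin the stable tag of the shared code
            "-d", local_dir,           # directory to create in the project
            module,
        ])
    return commands


if __name__ == "__main__":
    # Placeholder CVSROOT; the commands are printed, not executed.
    for cmd in checkout_commands(":pserver:user@cvs.example.com:/repo", DEPENDENCIES):
        print(" ".join(cmd))
```

Keeping the manifest at the top of the script and using the same script name in every project is what makes the approach consistent enough to live with.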

Stuff that is going to change a lot, and that you're not committed to keeping shared, should be cloned into the projects. Ideally you should be talking among the projects so that you can integrate fixes, pick up new versions from each other, etc. If you don't talk to each other, then the common code will fork. (But if you can't even talk to each other informally, then sticking it all in one directory won't "make" you talk. It'll just cause it to explode.)
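One lightweight way to notice when cloned copies have started to fork is to compare them file by file. The sketch below is an assumption about how a team might do that, not something from the original discussion: it hashes every file in two clones of the common code and reports the paths that differ or exist in only one copy. The directory names in the usage example are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diverged_files(copy_a, copy_b):
    """Return relative paths that differ between two clones or exist in only one."""
    a, b = file_digests(copy_a), file_digests(copy_b)
    return sorted(path for path in a.keys() | b.keys() if a.get(path) != b.get(path))


if __name__ == "__main__":
    # Hypothetical locations of two cloned copies of the common code.
    for name in diverged_files("project1/common", "project2/common"):
        print("diverged:", name)
```

A growing list is a prompt for the projects to talk and merge fixes before the copies drift too far apart.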

If, after doing this for a while you discover that hey, you guys actually talk to each other a lot and keep sharing code and the integration has become a hassle, then you'll have justification to task specific people with maintaining the common code full time.

But no matter which way you go, your gut feeling that developers should be able to just "checkout, build, run" without having to read three wiki pages and ask six people is absolutely correct. That's critical IMO.
