
Version control best practices

I just made the move to version control the other day, and after a bad experience with Subversion, I switched to Mercurial, and so far am happy with it.

Although I understand and appreciate the idea of version control, I don't really have any practical experience with it.

Right now, I am using it for a couple websites I am working on, and a couple questions have come to mind:

  • When/how often should I commit? After any major change, whether it works or not? When I'm done for the night? Only when it reaches its next stable iteration? After any bugfixes?
  • Would I branch off when I wanted to, say, change the layout of a menu, then merge back in?
  • Should I branch? What is the difference (for just me, a lone developer) between branching, then merging back in, and cloning the repository and pulling it back in?

Any other advice for a version control newbie?


So far, everyone has given me good advice, but very team-oriented. I would like to clarify:

At the moment, I am just using VC on some websites I do on the side. Not quite full-out freelance work, but for the purposes of VC, I am the only one that really touches the website code.

Also, since I am using PHP on the sites, there is no compiling to be done.

Does this change your answers significantly?

Austin Hyde asked Jan 03 '10 18:01





2 Answers

The answers to most of your questions depend on who you are working with. If you're a lone developer it doesn't matter much, since you can do whatever you like. But if you're in a team where you have to share your code, you should agree with your team members on a code of conduct, since sharing changes with one another can become tricky at times.

The discussion about the code of conduct doesn't need to be lengthy; it can be very brief, as long as everyone is on the same page about how to use the repository the team shares. If you want to use the more advanced features in Mercurial, such as cherry picking or patch queues, use them in a way that doesn't affect your team members negatively. Rebasing changesets that are already in a public repository, for example, would.

Remember version control has to be easy to use for everyone in the team, or else it won't be used.

When/how often should I commit? After any major change, whether it works or not? When I'm done for the night? Only when it reaches its next stable iteration? After any bugfixes?

When working with a team there are several approaches, but the common rule is to commit early and often. The main reason to commit often is that it makes merge conflicts easier to handle.

Simply put, a merge conflict occurs when merging a file that has been changed by at least two people doesn't work because they have edited the same lines. If you hold on to a commit that involves a very large change, with many changed lines across several files, it becomes very difficult for the receiver to resolve the conflicts that may occur. The conflicts get even harder to handle the longer that set of changes is held back.
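To make the "same lines" idea concrete, here is a minimal sketch in Python. It is not how Mercurial or git merge internally (real tools work on hunks and handle insertions and deletions); the function name `conflicting_lines` and the sample menu markup are made up for illustration, and the sketch assumes all three versions have the same number of lines.

```python
def conflicting_lines(base, mine, theirs):
    """Return 0-based indexes of lines that both sides changed differently.

    Simplification: all three versions must have the same number of lines,
    so only in-place edits are modelled (no insertions or deletions).
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    conflicts = []
    for i, (b, m, t) in enumerate(zip(base, mine, theirs)):
        mine_changed = m != b
        theirs_changed = t != b
        # A conflict needs both sides to touch the line and disagree on the result.
        if mine_changed and theirs_changed and m != t:
            conflicts.append(i)
    return conflicts


base = ["<ul>", "<li>Home</li>", "<li>About</li>", "</ul>"]
mine = ["<ul>", "<li>Start</li>", "<li>About</li>", "</ul>"]
theirs_ok = ["<ul>", "<li>Home</li>", "<li>About us</li>", "</ul>"]
theirs_clash = ["<ul>", "<li>Index</li>", "<li>About</li>", "</ul>"]

print(conflicting_lines(base, mine, theirs_ok))     # [] (different lines edited)
print(conflicting_lines(base, mine, theirs_clash))  # [1] (same line edited)
```

The smaller and more frequent the commits, the fewer lines each merge has to compare, which is why conflicts tend to stay small when you commit often.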

There are some exceptions to the rule of committing often, and one is a breaking change. However, if you can commit locally (which Mercurial and git let you do inherently), you can commit breaking changes anyway. Just push upstream to the shared repository only after you have fixed whatever broke.

Would I branch off when I wanted to, say, change the layout of a menu, then merge back in? Should I branch?

There are many branching strategies to choose from (the Streamed Lines paper from 1998 has an exhaustive list of branching patterns), and when you're working only for yourself anything is fair game. When working in a team, however, you'd better discuss openly with the team whether you need to branch. Whenever you have the urge to branch, ask yourself the following questions:

  • Will my future changes be breaking the work of others?

  • Will my team have a direct negative impact from the changes I'll be doing until I'm done?

  • Is my code throwaway code?

If the answer to any of the questions above is yes, you should probably branch publicly, or keep the branch to yourself (you can do that in Mercurial in several ways). First discuss with your team how to carry out the whole endeavour, whether there is another way of doing it, and whether you are going to merge your changes back in. Sometimes there are factors at play that make branching unnecessary (this mostly depends on how modular the code is).

When you decide to branch, be prepared to handle a merge conflict. It is reasonable to expect that whoever created the branch and made the commits can also merge it back into the "main branch". At these times it helps a great deal if everyone in the team has written relevant commit comments.

As a side note: You do write good commit comments, right? RIGHT!? A good commit comment usually tells why that particular change was made or what feature the committer was working on, instead of a nondescript "I made a commit" kind of comment. This makes it easier for whoever is handling the big merge conflict to figure out which line changes can be overwritten and which ones to keep while going through the revision history.
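If you want to nudge yourself into the habit, a check along these lines could be wired into whatever you use to commit. This is only a sketch: the list of vague phrases, the `min_words` threshold, and the function name `message_problems` are assumptions for illustration, not a Mercurial or git feature.

```python
# Phrases that say nothing about what changed or why (illustrative list).
VAGUE_MESSAGES = {"fix", "fixes", "update", "changes", "wip", "i made a commit"}


def message_problems(message, min_words=4):
    """Return a list of complaints about a commit message (empty list = fine)."""
    text = message.strip()
    if not text:
        return ["empty message"]
    summary = text.splitlines()[0]
    problems = []
    if summary.lower().rstrip(".!") in VAGUE_MESSAGES:
        problems.append("message does not say what changed or why")
    if len(summary.split()) < min_words:
        problems.append(f"summary is shorter than {min_words} words")
    return problems


print(message_problems("I made a commit"))
print(message_problems("Fix menu overlap on narrow screens (layout broke below 600px)"))
```

The first call reports the nondescript message; the second returns an empty list because the summary names the change and the reason for it.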

Compile times, or build times rather, sometimes play into the branch discussion you may have. If your project has a slow build time then it might be a good idea to use a staging strategy in your branches. This strategy takes into account that all developers should integrate to a "main line" and changes that are approved are elevated (or "promoted") to the next stage, such as testing or release lines. It is classically illustrated with tag names for open source software like this:

main -o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-> ...
          \           \              \
test      o-----------o--------------o---------> ...
            1.0 RC1     \ 1.0 RC2      2.0 RC1
release                 o----------------------> ...
                          1.0

The point with this is that testers can work without being interrupted by the programmers and that there is a known baseline for those who are in release management. In distributed version control, the different lines could be cloned repositories and it may look a bit different since repositories share the versioning graph. The principle however is the same.
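The promotion rule can be sketched as a small model in Python. The class name `StagedLines`, the revision ids `r1` to `r3`, and the rule that a revision must pass through each stage in order are assumptions made for this example; a real setup would use branches, tags, or cloned repositories in your VCS rather than a dictionary.

```python
STAGES = ["main", "test", "release"]


class StagedLines:
    """Toy model: revisions are committed to 'main' and promoted stage by stage."""

    def __init__(self):
        # Per stage: revision id -> tag (None for untagged revisions).
        self.lines = {stage: {} for stage in STAGES}

    def commit(self, revision):
        self.lines["main"][revision] = None

    def promote(self, revision, to_stage, tag):
        index = STAGES.index(to_stage)  # raises ValueError for unknown stages
        if index == 0:
            raise ValueError("revisions enter 'main' through commit()")
        previous = STAGES[index - 1]
        # A revision may only move up if it already reached the stage below.
        if revision not in self.lines[previous]:
            raise ValueError(f"{revision} has not reached '{previous}' yet")
        self.lines[to_stage][revision] = tag


repo = StagedLines()
for rev in ["r1", "r2", "r3"]:
    repo.commit(rev)

repo.promote("r2", "test", "1.0 RC1")   # testers get a fixed baseline
repo.promote("r2", "release", "1.0")    # approved, so it moves up again
print(repo.lines["release"])            # {'r2': '1.0'}
```

Trying to promote `r3` straight to `release` raises a `ValueError`, which is the point of the strategy: nothing reaches a later stage without passing the earlier one.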

Regarding web development, there are virtually no build times. But if you branch in stages (or tag your release revisions), it becomes easier to roll back when you need to track down a difficult bug.

However, a whole other thing comes into play, and that is the time it takes to deploy the site. Version control tools are, in my experience, really bad at asset management. Art assets that total up to several GB are usually a huge pain in the butt to handle in Subversion (more so in Mercurial). You may need to handle assets in another, less time-consuming way, such as putting them in a shared space that is synced and backed up in a traditional manner (art assets are usually not worked on concurrently the way source code files are).

What is the difference (for just me, a lone developer) between branching, then merging back in, and cloning the repository and pulling it back in?

The concepts of branching and keeping remote repositories are closer now than with centralized version control tools. You could almost consider them being the same thing. In Mercurial (and in git) you can "branch" either by:

  • Cloning a repository

  • Creating a named branch

Creating a named branch means that you're making a new path in the versioning graph of the repository you create it in. Cloning a repository means you're copying the source repository to a new location and making a new path in the cloned repository's versioning graph. They are two different implementations of branching as a general concept in version control.
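A toy model in Python may help show why the two end up looking alike. This is a deliberately simplified sketch, not how Mercurial or git store history: the `Repo` class, its methods, and the changeset ids are invented for the example, and `pull` here only copies missing changesets without merging.

```python
import copy


class Repo:
    """Toy repository: each changeset records its parent and a branch label."""

    def __init__(self):
        self.changesets = {}   # id -> (parent id, branch name)
        self.tip = None        # changeset the next commit will build on
        self.branch = "default"

    def commit(self, cid):
        self.changesets[cid] = (self.tip, self.branch)
        self.tip = cid

    def named_branch(self, name):
        # Later commits in this same repository are labelled with `name`.
        self.branch = name

    def clone(self):
        return copy.deepcopy(self)

    def pull(self, other):
        # Copy over changesets this repository does not have yet.
        new = [cid for cid in other.changesets if cid not in self.changesets]
        for cid in new:
            self.changesets[cid] = other.changesets[cid]
        return new

    def heads(self):
        parents = {parent for parent, _ in self.changesets.values()}
        return sorted(cid for cid in self.changesets if cid not in parents)


site = Repo()
site.commit("a1")

experiment = site.clone()       # "branch" by cloning
experiment.commit("c1")

site.named_branch("new-menu")   # "branch" by naming, in the same repository
site.commit("b1")

print(site.pull(experiment))    # ['c1']
print(site.heads())             # ['b1', 'c1']
```

After the pull, `site` has two heads that both descend from `a1`: one came from a named branch and one from a clone, but in the graph they are simply two paths waiting to be merged.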

In practice, the only difference between the two methods that you should care about is how you use them. You clone a repository to have a copy of the source code and a place to store your own changes, and you create named branches whenever you want to do small experiments for yourself.

Since browsing through branches is a bit quirky for those who are accustomed to a straight line of commits, advanced users know how to manipulate their versions so that the version history is "clean", e.g. with cherry picking or rebase. At the moment the git docs actually explain rebase rather well.

Spoike answered Oct 12 '22 07:10



These are the practices that I follow

  • Each commit should make sense: one bug fix (or a set of bugs related to each other), one (small) new feature, etc. The idea is that if you need to roll back, your rollbacks fall on well-defined "boundaries".

  • Every commit should have a good message explaining what you are committing. Really get into this habit; you will thank yourself later. It doesn't have to be verbose, a few sentences will do. If you are using a bug tracking system, associating a bug number with your commit is also extremely helpful.

  • Now that I use git and branching is so incredibly fast and cheap, I tend to make a new branch for each new feature I'm about to implement. I'd never even consider doing this with many other VCSes. So branching depends on the system you are using, your codebase, your team, etc.; there are no hard rules there.

  • I prefer to always use the command line and get to know my VCS's commands directly. The disconnect that a GUI based frontend can cause can be a pain, and even damaging. Controlling your source code is very important, it's worth getting in there and doing it directly. But that's just my preference.

  • Back up your VCS. I back up my local repository with Time Machine, and then I push out to a remote repository on my server, and that server is backed up as well. A VCS alone is not really a "backup"; it can go down too, just like anything else.

Matt Greer answered Oct 12 '22 07:10
