I can see why distributed source control systems (DVCS - like Mercurial) make sense for open source projects.
But do they make sense for an enterprise? (over a centralized Source Control System such as TFS)
What features of a DVCS make it better or worse suited for an enterprise with many developers? (over a centralized system)
A distributed version control system (DVCS) is a type of version control in which the complete codebase, including its full version history, is mirrored on every developer's computer. Changes are committed to the local copy first and then exchanged between repositories by pushing and pulling.
A DVCS doesn't require a network connection for day-to-day work, so most development, except pushing and pulling, can be done while traveling or away from home or the office. Contributors have the full history on their own disk, and every change is first recorded in their own repository.
A DVCS is usually faster than a centralized VCS (CVCS) for everyday operations, because commands such as commit, diff, log and branch run against the local repository instead of contacting a remote server. Creating and switching branches is likewise a local, lightweight operation.
AWS CodeCommit is a managed option in the public cloud: it hosts Git repositories on AWS infrastructure, so storage grows with the repositories instead of being provisioned up front. Because it speaks the standard Git protocols, developers can collaborate on it from any machine with their existing Git tools.
I have just introduced a DVCS (Git in this case) in a large banking company, where Perforce, SVN or ClearCase were the centralized VCSs of choice:
I already knew of the challenges (see my previous answer "Can we finally move to DVCS in Corporate Software? Is SVN still a 'must have' for development?")
I have been challenged on three fronts:
centralization: while the decentralized model has its merits (and allows for private commits or working without the network while having access to the full history), there still needs to be a clear set of centralized repos, acting as the main reference for all developers.
authentication: a DVCS allows you to "sign-off" (commit) your code as... pretty much anyone (author "foo", email "[email protected]").
You can do a git config user.name foo, or git config user.name whateverNameIFeelToHave, and have all your commits with bogus names in them.
That doesn't mix well with the unique centralized "Active Directory" user referential used by big enterprises.
authorization: by default, you can clone from, push to or pull from any repository, and modify any branch or any directory.
For sensitive projects, that can be a blocking issue (the banking world is usually very protective of some pricing or quant algorithms, which require strict read/write access for a very limited number of people).
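To make the authentication point concrete, here is a minimal sketch of how a commit records identity. It assumes the standard layout of a Git commit object (a "commit <size>" header, a NUL byte, then plain-text tree, author and committer lines) and the well-known id of the empty tree. The helper names build_commit and commit_id and the example email address are hypothetical, chosen only for illustration.

```python
import hashlib

# Id of the empty tree, used here so the example needs no real repository.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def build_commit(tree_id, name, email, timestamp, message, tz="+0000"):
    """Return the raw bytes of a parentless commit object.

    The author and committer lines are free-form text: nothing in the
    object format ties them to a directory account.
    """
    ident = f"{name} <{email}> {timestamp} {tz}"
    body = (
        f"tree {tree_id}\n"
        f"author {ident}\n"
        f"committer {ident}\n"
        f"\n"
        f"{message}\n"
    ).encode()
    return b"commit " + str(len(body)).encode() + b"\x00" + body


def commit_id(raw_commit):
    """SHA-1 of the raw object, which is how a commit id is derived."""
    return hashlib.sha1(raw_commit).hexdigest()


# Two equally valid commits, one of them under a made-up name.
honest = build_commit(EMPTY_TREE, "foo", "foo@example.invalid", 0, "fix")
bogus = build_commit(EMPTY_TREE, "whateverNameIFeelToHave",
                     "foo@example.invalid", 0, "fix")
```

Both objects are well-formed and simply hash to different ids, which is why identity has to be enforced where the push is received rather than trusted from the commit itself.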
The answer (for a Git setup) was:
.
pull (read) through http, but also push (write) through http. The authentication part is also reinforced at the Git level by a server-side hook (pre-receive or update, since a post-receive hook runs too late to reject anything) which makes sure that at least one of the commits you are pushing to a repo has a "committer name" equal to the user name detected through the ssh or http protocol.
In other words, you need to set up your git config user.name correctly, or any push you want to make to a central repo will be rejected.
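The check described above can be sketched roughly as follows. This is an illustration, not the hook that was actually deployed: the function names are hypothetical, the authenticated user is passed in as a plain argument (how it reaches the hook depends on the ssh or http front end), and the committer names are assumed to have been collected beforehand, for example with git log --format=%cn over the pushed range.

```python
def parse_ref_updates(stdin_text):
    """Split the hook's standard input into (old_id, new_id, ref_name).

    A receive hook gets one "<old-id> <new-id> <ref-name>" line per
    updated ref.
    """
    updates = []
    for line in stdin_text.splitlines():
        if line.strip():
            old_id, new_id, ref_name = line.split(" ", 2)
            updates.append((old_id, new_id, ref_name))
    return updates


def push_allowed(authenticated_user, committer_names):
    """Accept the push only if at least one pushed commit was committed
    under the name the server authenticated."""
    return authenticated_user in committer_names
```

A hook built on this would exit with a non-zero status when push_allowed returns False, which is what makes the server refuse the push.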
.
The gitolite Perl script will parse a simple text file where the authorizations (read/write access for a whole repository, for branches within a given repository, or even for directories within a repository) have been set.
If the access level required by the git command doesn't match the ACL defined in that file, the command is rejected.
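The idea of matching a git command against an ACL can be sketched like this. The data layout below is a simplified, hypothetical model and not gitolite's real configuration syntax; the repository name, user names and the is_allowed helper are made up for the example, and per-directory rules are left out.

```python
import re

# Hypothetical ACL: per repository, a list of
# (permission, ref pattern, allowed users) rules.
ACL = {
    "pricing-engine": [
        ("RW", r"refs/heads/master", {"alice"}),
        ("RW", r"refs/heads/dev/.*", {"alice", "bob"}),
        ("R", r".*", {"alice", "bob", "carol"}),
    ],
}


def is_allowed(acl, repo, user, action, ref=""):
    """Return True if `user` may run `action` (clone, pull or push).

    Reading needs any rule granting R to the user; pushing needs a rule
    granting W whose pattern matches the ref being updated.
    """
    needed = "W" if action == "push" else "R"
    for permission, ref_pattern, users in acl.get(repo, []):
        if user not in users or needed not in permission:
            continue
        if needed == "R" or re.fullmatch(ref_pattern, ref):
            return True
    return False
```

With these rules, bob can push to refs/heads/dev/feature but not to refs/heads/master, carol can only clone or pull, and anyone not listed is rejected outright.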
The above describes what I needed to implement for a Git setting, but more importantly, it lists the main issues that need to be addressed for a DVCS setting to make sense in a big corporation with a unique user base.
Then, and only then, a DVCS (Git, Mercurial, ...) can add value because of:
data exchange between multiple sites: while those users are all authenticated through the same Active Directory, they can be located across the world (the companies I have worked for usually have development split between teams across two or three countries). A DVCS is naturally made for efficiently exchanging data between those distributed teams.
replication across environments: a setting taking care of authentication/authorization allows for cloning those repositories on other dedicated servers (for integration testing, UAT testing, pre-production, and pre-deployment purposes)
process automation: the ease with which you can clone a repo can also be used locally on one user's workstation, for unit-testing purposes with the "guarded commits" techniques and other clever uses: see "What is the cleverest use of source repository that you have ever seen?".
In short, you can push to a second local repo in charge of:
.