 

Is it good practice to store binary dependencies in source control?

Over the years, I've always stored binary dependencies in the \lib folder and checked that into source control with the rest of the project. I find I do this less now that we have NuGet and NuGet Package Restore.

I've heard that some companies enforce a rule that no binaries can be checked into source control. The reasons cited include:

  1. Most VCSs do not deal well with binaries - diffing and merging are not well supported
  2. Disk usage increases
  3. Commits and updates are slower
  4. The extra functionality, control, and ease of use that a repository manager provides out of the box are lost
  5. It encourages further bad practice; ideally, projects should fully automate their builds, and checking binaries into version control is typically a manual step

Are there objective arguments for or against this practice for the vast majority of projects that use source-control?

asked Apr 16 '15 by Steve Dunn



2 Answers

I would strongly recommend that you NOT adopt the practice you describe (forbidding binaries in source control). In fact, I would call it an organizational anti-pattern.

The single most important rule is:

You should be able to check out a project on a new machine, and it has to compile out of the box.

If this can be done via NuGet, fine. If not, check in the binaries. If there are legal/license issues preventing that, you should at least have a text file (named how_to_compile.txt or similar) in your repo that contains all the required information.
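Where binaries can't be committed for licensing reasons, such a file might look like this (the package names, versions, and solution file are purely illustrative):

```
Third-party dependencies not included in this repository
--------------------------------------------------------
1. Acme.Imaging 2.1.4 (commercial license; obtain from the vendor portal)
   - place Acme.Imaging.dll in \lib before building
2. Restore the remaining packages:  nuget restore MySolution.sln
3. Build:                           msbuild MySolution.sln /p:Configuration=Release
```

The point is that the repository itself documents every manual step that stands between a fresh checkout and a working build.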

Another very strong reason to do it like this is to avoid versioning problems - or do you know

  • which exact version of a certain library was in operation some years ago and
  • if it REALLY was the actual version that was used in the project and
  • probably most important: do you know how to get that exact version?
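The versioning questions above become mechanically answerable when the binaries are committed alongside a checksum manifest. A minimal sketch, assuming a lib/ folder of vendored binaries (the folder layout and file name are hypothetical; the demo below creates a throwaway stand-in so the commands are self-contained):

```shell
set -e
# Self-contained demo: a temporary folder standing in for a project
# with vendored binaries in lib/ (file name is hypothetical).
demo=$(mktemp -d)
cd "$demo"
mkdir lib
printf 'pretend this is a compiled dependency' > lib/SomeVendor.Core.dll

# Record the exact bytes of every vendored binary in a manifest...
sha256sum lib/*.dll > MANIFEST.sha256
cat MANIFEST.sha256

# ...so that any historical checkout can be verified byte-for-byte.
sha256sum -c MANIFEST.sha256
```

With the manifest checked in next to the binaries, "which exact version was in use three years ago" is answered by checking out that commit and running the verification step.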

Some other arguments against the above:

  • Checking in binaries greatly facilitates build automation (and does not hinder it). This way the build system can get everything it needs from VCS without further ado. If you do it the other way, then there are always manual steps involved.
  • Performance considerations are completely irrelevant as long as you work on an intranet, and of only minor relevance when using a web-based repository (I assume we're talking about no more than, say, 30-40 MB, which is not a big deal for today's bandwidth).
  • No functionality at all is lost. That's simply not true.
  • It's also not true that normal commits etc. are slower. This is only the case when dealing with the large binaries themselves, which usually happens only once.
  • And, if you have your binary dependencies checked in, you have at least some control. If you don't, you have none at all. And this surely has a much higher likelihood of errors...
answered Oct 15 '22 by Thomas Weller


My own rule of thumb is that generated assets should not be version-controlled (regardless of whether they're binary or textual). There are several things like images, audio/video files, etc. which might be checked in, and for good reason.
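In git, that rule of thumb usually turns into a .gitignore that excludes build output while leaving hand-managed assets tracked (the paths here are illustrative, not prescriptive):

```
# Generated build output - reproducible from source, so keep it out
bin/
obj/
*.log

# Hand-managed binaries (images, audio, vendored libs in lib/) are
# deliberately NOT listed here, so they stay under version control.
```

The distinction is generated vs. hand-managed, not binary vs. text.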

As for the specific points:

  1. You can't merge these kinds of files, but they're usually replaced wholesale rather than merged piecewise. Diffing them might be possible for some files using custom diff tools, but in general this is handled with metadata like version numbers.

  2. If you had a large text file, disk usage would not be an argument against version control. The same applies here: the idea is that changes to the file need to be tracked. In the worst case, it's possible to put these assets in a separate repository (one that doesn't change very often) and then include it in the current one using something like git submodules.
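The submodule approach can be sketched as follows. The repositories here are throwaway local stand-ins so the commands are self-contained; in a real project the assets repository would live on your server:

```shell
set -e
# Create a local stand-in for a rarely-changing binary-assets repository.
workdir=$(mktemp -d)
cd "$workdir"
git init -q binary-assets
(cd binary-assets &&
  git config user.email dev@example.com && git config user.name dev &&
  printf 'fake binary' > vendor.dll &&
  git add vendor.dll && git commit -qm "Add vendor binary")

# Create the main project repository.
git init -q project
cd project
git config user.email dev@example.com && git config user.name dev
echo "hello" > README && git add README && git commit -qm "Initial commit"

# Include the assets repo as a submodule. (Recent git versions block the
# file transport for submodules by default, hence the explicit override;
# it is not needed for normal remote URLs.)
git -c protocol.file.allow=always submodule add "$workdir/binary-assets" assets
git commit -qm "Track binary assets as a submodule"

ls assets/vendor.dll
```

On a fresh machine, `git clone --recurse-submodules` then restores both the project and the pinned assets revision in one step.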

  3. This is simply not true. Operations on that specific file might be slower but that's okay. It would be the same for text files.

  4. I think having things in version control increases the convenience provided by the repository manager.

  5. This touches on my point that the files in question shouldn't be generated. If the files are not generated, then checkout and build is one step. There's no "download binary assets" stage.

answered Oct 15 '22 by Noufal Ibrahim