
Version control for large binary files and >1TB repositories?

Sorry to bring up this topic again, as there are so many related questions already, but none of them covers my problem directly.

What I'm looking for is a good version control system that can handle just two simple requirements:

  1. store large binary files (>1GB)
  2. support a repository that's >1TB (yes, that's TB)

Why? We're in the process of repackaging a few thousand software applications for our next big OS deployment and we want those packages to follow version control.

So far I have some experience with SVN and CVS, but I'm not satisfied with the performance of either with large binary files (a few MSI or CAB files will be >1GB). I'm also not sure whether they scale well to the amount of data we expect over the next 2-5 years (as I said, an estimated >1TB).

So, do you have any recommendations? I'm also looking into SVN externals and Git submodules, though that would mean a separate repository for each software package, and I'm not sure that's what we want.
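
For anyone sizing up a similar package tree against the two requirements above (files over 1GB, a repository over 1TB), a rough survey script can help before picking a tool. This is only a minimal sketch: the function name `survey`, the 1024-based byte constants, and the idea that all packages sit under one directory are assumptions for illustration, not something taken from the setup described here.

```python
import os

# Assumed thresholds: 1 GB and 1 TB interpreted as powers of 1024.
GIB = 1024 ** 3
TIB = 1024 ** 4


def survey(root, large_file_bytes=GIB):
    """Walk `root` and return (total_bytes, large_files).

    large_files is a list of (path, size) tuples for every file
    whose size is strictly greater than large_file_bytes.
    """
    total = 0
    large = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Skip files that disappear or cannot be read.
                continue
            total += size
            if size > large_file_bytes:
                large.append((path, size))
    return total, large
```

Calling `survey("/path/to/packages")` (a hypothetical path) returns the total size in bytes and the list of oversized files; dividing the total by `TIB` shows how close the tree already is to the 1TB mark.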

Christoph Voigt asked Mar 08 '11 15:03



2 Answers

Take a look at Boar, "Simple version control and backup for photos, videos and other binary files". It can easily handle huge files and huge repositories.

Mats Ekberg answered Sep 19 '22 18:09


Old question, but perhaps worth pointing out that Perforce is in use at lots of large companies, particularly games development companies, where multi-terabyte repositories with many large binary files are common.

(Disclaimer: I work at Perforce)

Robert Cowham answered Sep 19 '22 18:09