How good is Subversion at storing lots of binary files? [closed]


In my previous company we set up Subversion to store CAD files. Files up to 100 MB were stored in Subversion. If many people 'add' big files to Subversion, the web server can become a bottleneck. However, incremental commits were perfectly fine.

Subversion stores binary deltas. In fact, on the server side, binary and text files are treated exactly the same when storing the delta. See the "Binary delta encoding improvements" section at http://subversion.tigris.org/svn_1.4_releasenotes.html. It explicitly says "Subversion uses the xdelta algorithm to compute differences between strings of bytes" (and not strings of 'characters').
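To make the "bytes, not characters" point concrete, here is a toy byte-level delta in Python. It is not xdelta and not what Subversion actually runs; it uses `difflib.SequenceMatcher` from the standard library as a stand-in, and `make_delta`, `apply_delta` and `literal_bytes` are names invented for this sketch.

```python
import random
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes):
    """Describe `new` as copies from `old` plus literal inserted bytes."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))        # reuse bytes already in the old version
        elif tag in ("replace", "insert"):
            ops.append(("insert", new[j1:j2]))  # only these bytes are really new
        # "delete" needs no op: the old bytes are simply not copied
    return ops


def apply_delta(old: bytes, ops) -> bytes:
    """Rebuild the new version from the old version and the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def literal_bytes(ops) -> int:
    """Count the bytes the delta has to carry itself."""
    return sum(len(op[1]) for op in ops if op[0] == "insert")


if __name__ == "__main__":
    rng = random.Random(42)
    old = bytes(rng.getrandbits(8) for _ in range(4096))  # stand-in for a binary file
    new = bytearray(old)
    new[2000] ^= 0xFF                                      # a "minor modification"
    new[2001] ^= 0xFF
    new = bytes(new)

    delta = make_delta(old, new)
    assert apply_delta(old, delta) == new
    print("file size:", len(new), "literal bytes in delta:", literal_bytes(delta))
```

Nothing here cares whether the bytes are text or a CAD model, which is the point of the release note quoted above. Real delta encoders are far faster and more compact than this.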

Just as an experiment, I stored 10 versions of a CAD file (a CATIA part file). For each version I made minor modifications to the part and then checked the server-side repository size. The total size was about 1.2x after about 10 revisions (x being the original file size).
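As rough arithmetic on that experiment, the Python sketch below compares the measured 1.2x with what ten full copies would cost. It assumes the first revision is stored at full size and ignores compression and repository overhead; the 100 MB file size and the function name `storage_estimate` are only illustrative.

```python
def storage_estimate(file_mb: float, revisions: int, observed_ratio: float):
    """Return (full-copy total, observed total, average delta per later revision) in MB."""
    full_copies_mb = file_mb * revisions          # if every revision were a full copy
    observed_mb = file_mb * observed_ratio        # what the repository actually grew to
    # Assumption: revision 1 costs the full file, the rest share the remainder.
    avg_delta_mb = (observed_mb - file_mb) / (revisions - 1)
    return full_copies_mb, observed_mb, avg_delta_mb


if __name__ == "__main__":
    full, observed, avg_delta = storage_estimate(100, 10, 1.2)
    print(f"10 full copies: {full:.0f} MB")
    print(f"observed:       {observed:.0f} MB")
    print(f"avg delta/rev:  {avg_delta:.1f} MB")
```

Under those assumptions each minor modification cost a little over 2% of the file size, which will of course vary with the file format and the kind of change.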

Remember to set the svn:needs-lock property. In my experience, the best way is to use auto-props to set svn:needs-lock based on file extension.
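For illustration, the Python sketch below embeds a sample of what the relevant part of the Subversion client config file could look like and shows which properties a new file would pick up. The file patterns (`*.CATPart`, `*.dwg`) are examples I chose, `load_auto_props` and `props_for` are hypothetical helpers, and the matching is a simplification using `fnmatch`, not Subversion's own code.

```python
import configparser
from fnmatch import fnmatchcase

# Sample client-side config; the patterns are illustrative assumptions.
SAMPLE_CONFIG = """
[miscellany]
enable-auto-props = yes

[auto-props]
*.CATPart = svn:needs-lock=*;svn:mime-type=application/octet-stream
*.dwg = svn:needs-lock=*
"""


def load_auto_props(text: str) -> dict:
    """Read the [auto-props] section into {pattern: property spec}."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep the case of patterns such as *.CATPart
    parser.read_string(text)
    return dict(parser["auto-props"])


def props_for(filename: str, auto_props: dict) -> dict:
    """Return the properties a newly added file would receive in this sketch."""
    props = {}
    for pattern, spec in auto_props.items():
        if fnmatchcase(filename, pattern):  # case-sensitive here, as a simplification
            for item in spec.split(";"):
                name, _, value = item.partition("=")
                props[name.strip()] = value.strip()
    return props


if __name__ == "__main__":
    rules = load_auto_props(SAMPLE_CONFIG)
    for name in ("bracket.CATPart", "plan.dwg", "notes.txt"):
        print(name, props_for(name, rules))
```

Auto-props only apply when a file is added or imported, so files already in the repository still need the property set by hand with svn propset.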


There's a difference between lots of big binary files, and a big number of binary files.

In my experience SVN is fine with individual binary files of several hundred megabytes. The only problems I've seen begin to occur with individual files of around a gigabyte or so. Operations fail for mysterious and unknown reasons, possibly because SVN fails to handle network-related problems.

I am not aware of any SVN problems related to the number of binary files, beyond their lack of merge-ability and the fact that binary files often can't be efficiently stored as deltas (SVN can use deltas).

So:

  • 1000 1MB files = fine.
  • 100 10MB files = fine.
  • 10 100MB files = fine.
  • 1 >1000MB file = not a good idea.

I would hope the size of your documents fits into one of the fine categories :)
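If you want to check before importing, a small Python sketch like the one below can list the files that fall into the "not a good idea" bucket. The 1000 MB threshold is just my rule of thumb from the list above, not a documented Subversion limit, and `find_risky_files` is a made-up helper name.

```python
import os

# Rule-of-thumb threshold from the list above, not an official limit.
RISKY_BYTES = 1000 * 1024 * 1024


def find_risky_files(root: str, limit: int = RISKY_BYTES):
    """Return (path, size) for every file under `root` at or above `limit` bytes."""
    risky = []
    for dirpath, dirs, files in os.walk(root):
        dirs[:] = [d for d in dirs if d != ".svn"]  # skip working-copy metadata
        for name in files:
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            if size >= limit:
                risky.append((path, size))
    return sorted(risky)


if __name__ == "__main__":
    for path, size in find_risky_files("."):
        print(f"{size / (1024 * 1024):.0f} MB  {path}")
```

The count of files is not checked at all, since in my experience the number of files is not what causes trouble.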


We built our Subversion client exactly for this, as we did really big design/consulting jobs that really needed version control. We never had any problems with it.


It depends on how often the files are updated. It can't do anything about merging binary files, so every time there's a conflict you'll have pain. Otherwise it's just storage and retrieval, and while it's not as good as with text, it still handles that just fine.