
How exactly does subversion store files in the repository?

I read the Subversion book, and it is clear to me that Subversion does not store a full copy of every file version but only deltas, in order to minimize disk space. Subversion does the same with binary files (this used to be a huge weakness of CVS).

However I do not understand the exact mechanism. When I commit a file what happens?

  1. Subversion stores only the diff (and already has the old version)
  2. Subversion deletes the previous version, stores the new file intact and creates a reverse diff in order to "re-create" the old version if needed.
  3. Something else that I haven't thought of.

The first case might seem the most logical. This however raises another question. If I have in a subversion repository a file with 1000 commits and a new developer checks out a clean copy, then subversion would have to fetch the original version (initial import) and apply 1000 diffs on this before returning the result. Is this correct? Is there some sort of caching for files where the latest version is kept as well?

Basically where can I find information on the svn repository internals?

Update: Apparently the backend of Subversion plays a big role in this. At the time of writing, FSFS uses option 1 while BDB uses option 2. Thanks msemack!
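To make the trade-off between options 1 and 2 concrete, here is a minimal toy model in Python. It is not Subversion's actual storage code: the delta format, the function names, and the line-based diffing via `difflib` are all assumptions made for illustration. It only counts how many deltas must be applied to rebuild a given revision under each layout.

```python
from difflib import SequenceMatcher


def make_delta(source, target):
    """Return copy/insert instructions that turn `source` into `target` (lists of lines)."""
    ops = []
    matcher = SequenceMatcher(a=source, b=target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))            # reuse lines from the source
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))   # new lines stored literally
        # "delete" needs no instruction: the source range is simply skipped
    return ops


def apply_delta(source, ops):
    """Rebuild the target by replaying the instructions against `source`."""
    out = []
    for op in ops:
        if op[0] == "copy":
            out.extend(source[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


def forward_store(versions):
    """Option 1 (hypothetical layout): first version in full, then old -> new deltas."""
    return versions[0], [make_delta(a, b) for a, b in zip(versions, versions[1:])]


def reverse_store(versions):
    """Option 2 (hypothetical layout): latest version in full, plus new -> old deltas."""
    return versions[-1], [make_delta(b, a) for a, b in zip(versions, versions[1:])]


def checkout_forward(base, deltas, rev):
    """Return (content, deltas applied) for revision `rev`, starting from the first version."""
    text, applied = base, 0
    for delta in deltas[:rev]:
        text = apply_delta(text, delta)
        applied += 1
    return text, applied


def checkout_reverse(head, deltas, rev):
    """Return (content, deltas applied) for revision `rev`, starting from the latest version."""
    text, applied = head, 0
    for delta in reversed(deltas[rev:]):
        text = apply_delta(text, delta)
        applied += 1
    return text, applied


# 100 revisions of a file that gains one line per commit (revisions 0..99).
versions = [[f"line {j}" for j in range(i + 1)] for i in range(100)]
base, forward_deltas = forward_store(versions)
head, reverse_deltas = reverse_store(versions)

print(checkout_forward(base, forward_deltas, 99)[1])  # deltas applied for the latest revision
print(checkout_reverse(head, reverse_deltas, 99)[1])
```

With plain forward deltas the latest revision costs 99 delta applications here, while the reverse layout returns it without applying any; the costs swap for the oldest revision. That linear growth is exactly the concern raised in the question, and it is why a real backend would not use a naive chain.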

kazanaki asked Feb 25 '10 09:02



1 Answer

Because Subversion's repository format is entirely internal, the developers are free to change the representation from one release to the next. I believe the current release generally stores reverse deltas (your option 2), but also stores complete snapshots periodically so it doesn't have to resolve 1000 diffs before returning a result.

The Subversion 1.6 release notes have a section on filesystem storage improvements that includes some notes on this, along with links to other sources. Suffice it to say that the details of Subversion data storage are complex and subject to change.

There is also a design document in the Subversion source tree that describes the use of skip deltas in Subversion. Generally, the /notes/ directory contains several useful documents regarding Subversion internals.
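As a rough illustration of the skip-delta idea, here is a short Python sketch. It assumes the base-selection rule as I understand it from that design document for FSFS: the delta base for the n-th change to a file is found by clearing the lowest set bit of n. The function names are hypothetical, and real repositories apply further tuning, so treat this as a simplified model rather than the actual implementation.

```python
def skip_delta_base(n):
    """Assumed rule: the delta base for change n is n with its lowest set bit cleared."""
    return n & (n - 1)


def delta_chain(n):
    """List the changes visited when rebuilding change n, ending at the full text (change 0)."""
    chain = []
    while n > 0:
        chain.append(n)
        n = skip_delta_base(n)
    chain.append(0)
    return chain


chain = delta_chain(1000)
print(chain)           # [1000, 992, 960, 896, 768, 512, 0]
print(len(chain) - 1)  # 6 deltas to apply instead of 1000
```

Under this rule the number of deltas needed for change n equals the number of 1 bits in n, so reconstruction cost grows logarithmically rather than linearly with the number of commits.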

Greg Hewgill answered Oct 05 '22 20:10