I have about a dozen repositories that are each 1 GB to 10 GB in size on the file system, and I need to set up automated backups for all of them on our 64-bit Windows XP machines (our old backup scripts were lost when a computer went down).
After reading this question about the best way to back up SVN repos, I started dumping the biggest repo we have, which is about 13 GB. This command has been executing for ~2.5 hours now, and it's currently dumping revision ~200 of 300+.
svnadmin dump --deltas \\path\to\repo\folder > \\path\to\backup\folder\dump.svn
The dump file is over 100 GB and counting. I know I can 7-zip this sucker, but 100 GB?! ... o_O
The repositories contain a large amount of binary data, which could be part of the problem, but as of right now, switching to a more efficient version control system (assuming there is one) is not realistic; SVN is a part of life here.
I've considered using hotcopy, which takes up a lot less space, but I tried using one of our old hotcopy-ed backups, and subversion 1.7 couldn't find a bunch of files it needed. It seems that I'd have to install the version of SVN that originally hotcopy-ed the repo, and dump that repo to get it into a newer SVN. This statement seems to verify the problem I'm having with hotcopy: http://svn.haxx.se/users/archive-2005-05/0842.shtml
I feel like I've just got to be missing something. Maybe there's some flag for dump that magically makes the dump 1/5 the size...
Do I have any other options?
UPDATE: The last revision, #327, was just dumped. The final size of the dump file is 127 GB. That's from a 13.5 GB repo. All of my repositories combined are probably about three times that size.
Description. Dump the contents of the filesystem to stdout in a "dump file" portable format, sending feedback to stderr. Dump revisions LOWER rev through UPPER rev. If no revisions are given, dump all revision trees.
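To illustrate the LOWER:UPPER revision range described above, here is a minimal Python sketch that splits a repository's history into chunks and builds one svnadmin dump command per chunk. This does not shrink the total dump size; it only produces smaller files that can be compressed or copied one at a time. The chunk size of 100, the repository path, and the function names are assumptions made for the example. The youngest revision (327 in the question) could be read with svnlook youngest.

```python
from typing import List, Tuple


def revision_ranges(youngest: int, chunk: int) -> List[Tuple[int, int]]:
    """Split revisions 0..youngest into inclusive (lower, upper) ranges."""
    return [
        (lower, min(lower + chunk - 1, youngest))
        for lower in range(0, youngest + 1, chunk)
    ]


def dump_command(repo_path: str, lower: int, upper: int) -> List[str]:
    """Build the argument list for one 'svnadmin dump' over lower:upper."""
    cmd = ["svnadmin", "dump", repo_path, "-r", f"{lower}:{upper}", "--deltas"]
    if lower > 0:
        # Without --incremental, the first revision of the range would be
        # dumped as a full tree instead of a diff against its predecessor.
        cmd.append("--incremental")
    return cmd


# Dry run: print the commands for a hypothetical repository path.
for lower, upper in revision_ranges(327, 100):
    print(" ".join(dump_command(r"D:\repositories\repo", lower, upper)))
```

Each command writes to stdout, so in practice its output would be redirected to a separate file per range, and the files would later be fed to svnadmin load in order.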
For daily backup I would say you really don't need to do an svnadmin dump. I would use the dump method if you were about to transfer the repository to a new server which may be running a different SVN version and OS, as it's the most portable way of dumping the repository, but it's not very space-efficient.
I'd recommend using the hotcopy methods referred to in that link. That will guarantee that the state of the filesystem is consistent, and will also copy the configuration files and hook scripts (incidentally, svnadmin dump doesn't copy these, so you'll end up with an incomplete backup). Because it's just a direct copy of the repository, it's the same size, so the backup should be much more manageable.
In an emergency, if you need to restore a backup made with hotcopy, all you should need is a machine with the same major version of SVN (e.g. 1.6 or 1.7) and, to be safe, the same OS. You should then be able to use this repository directly, or you can do an svnadmin dump at this point to transfer to a new server.
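The hotcopy approach could be scripted for a dozen repositories roughly as follows. This is only a sketch: the repository paths, the backup root, the date-stamped folder naming, and the function names are all assumptions, and it relies only on the basic svnadmin hotcopy REPOS_PATH NEW_REPOS_PATH form.

```python
import datetime
import os
import subprocess
from typing import Iterable, List


def hotcopy_command(repo_path: str, backup_root: str, stamp: str) -> List[str]:
    """Build 'svnadmin hotcopy SRC DST' with a date-stamped destination."""
    name = os.path.basename(os.path.normpath(repo_path))
    target = os.path.join(backup_root, f"{name}-{stamp}")
    return ["svnadmin", "hotcopy", repo_path, target]


def run_backups(repo_paths: Iterable[str], backup_root: str) -> None:
    """Hotcopy every repository; raises if svnadmin reports a failure."""
    stamp = datetime.date.today().strftime("%Y%m%d")
    for repo_path in repo_paths:
        subprocess.run(hotcopy_command(repo_path, backup_root, stamp), check=True)


# Dry run with hypothetical paths: show the command without executing it.
print(hotcopy_command(r"D:\repositories\repo", r"E:\backups", "20240101"))
```

Calling run_backups from a scheduled task would give one full copy per repository per day, so old copies would need to be pruned separately.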
EDIT: comparison of svnsync and hotcopy:
common aspects:
Advantages of hotcopy:
Advantages of svnsync:
Thanks to the suggestions of bahrep and the_mandrill, I decided to go with svnsync for these repositories. I was able to get it set up quite easily, and since we don't have any hooks or config files, there's nothing else to back up. Because of the problems I had with hotcopy (thanks to the_mandrill for proposing a solution to these issues) I decided that svnsync would be the simpler solution for us.
In addition to what the_mandrill pointed out, svnsync has other advantages:
To set up svnsync, I had to complete the following steps. Excuse any typos. All of our repositories are hosted using VisualSVN Server.
Create a new, empty repository:
svnadmin create \\computerB\C$\repositories\mirror
Create the file \mirror\hooks\pre-revprop-change.bat. Its only content is this one line:
exit 0
Initialize the sync:
svnsync init https://computerB.domain.net/svn/mirror https://computerA.domain.net/svn/repo
Synchronize the two repos:
svnsync synchronize https://computerB.domain.net/svn/mirror https://computerA.domain.net/svn/repo
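Once each mirror has been initialized as in the steps above, the recurring synchronize step could be automated with something like this Python sketch. The mirror URLs and function names are hypothetical. It passes only the destination URL, on the understanding that svnsync init records the source URL in the mirror, and it uses --non-interactive so that a scheduled run does not stall on a prompt.

```python
import subprocess
from typing import Callable, Dict, Iterable, List


def sync_command(mirror_url: str) -> List[str]:
    """Build the 'svnsync synchronize' call for one initialized mirror."""
    return ["svnsync", "synchronize", "--non-interactive", mirror_url]


def sync_all(
    mirror_urls: Iterable[str],
    runner: Callable = subprocess.run,
) -> Dict[str, bool]:
    """Synchronize every mirror and report success (True) or failure per URL."""
    results: Dict[str, bool] = {}
    for url in mirror_urls:
        completed = runner(sync_command(url))
        results[url] = completed.returncode == 0
    return results


# Dry run: print the command for a hypothetical mirror URL.
print(" ".join(sync_command("https://computerB.domain.net/svn/mirror")))
```

A failed mirror does not stop the loop here, so one broken repository would not block the others from syncing; the returned dictionary can be logged or mailed by whatever schedules the script.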
Beginning with VisualSVN Server 3.6, you can use the Backup-SvnRepository PowerShell cmdlet to make a backup of a Subversion repository. To restore the repository from backup, use the Restore-SvnRepository cmdlet.
What is more, the Enterprise Edition of the server offers a scheduled backup feature. The built-in scheduled backup supports several backup types including incremental backups that are efficient in terms of storage space and time required to take the backup.