 

Best practices for a single large SVN project

I have inherited a single project in SVN: 30 GB in over 300,000 files. There are tons of binary files in there, mostly in an images folder. Operations like updating the entire project can be dramatically slow.

The team has evolved a process of running update/switch only on the specific folders they are working on, and they end up checking in broken code because "it works on my computer". Any one person's working copy can include out-of-date code, switched code, and forgotten, never-committed code. Also, minimal branching takes place.
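To make that working-copy state visible, one option is the `svnversion` tool that ships with Subversion. It prints a compact summary such as `4123:4168MS`: a revision range means mixed revisions (typical after updating only some folders), `M` means local modifications, `S` means switched paths, and `P` means a partial (sparse) checkout. The sketch below only parses that string; the function name `parse_svnversion` is a hypothetical helper, and note that `svnversion` does not report unversioned files, so code that was never added would still need `svn status`.

```python
import re

# Matches svnversion output like "4168", "4123:4168", "4168M", "4123:4168MS".
_SVNVERSION = re.compile(r"^(\d+)(?::(\d+))?([MSP]*)$")


def parse_svnversion(text):
    """Turn svnversion output into a dict of working-copy warning flags.

    Hypothetical helper for illustration; raises ValueError for output
    it does not recognise (e.g. "exported" or an unversioned directory).
    """
    match = _SVNVERSION.match(text.strip())
    if match is None:
        raise ValueError("unrecognised svnversion output: %r" % text)
    low = int(match.group(1))
    high = int(match.group(2)) if match.group(2) else low
    flags = match.group(3)
    return {
        "mixed_revisions": low != high,  # parts of the tree at different revisions
        "modified": "M" in flags,        # uncommitted local changes
        "switched": "S" in flags,        # some paths switched to another URL
        "partial": "P" in flags,         # sparse checkout
    }
```

A pre-commit habit could be as simple as running `svnversion` at the working copy root and refusing to trust a local build when any of these flags is set.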

My personal solution is a small bash checkout/build script that runs at 5am every morning. However, not everyone has the command-line courage to even copy my solution, and they would rather keep the comfort of TortoiseSVN and the broken process.

Has anyone tried to tune such a large repository and can give advice? Are there any best practices I can implement for working with large repositories that I can ease everyone into?

P.S. Externals don't seem to be a good idea, and SVN optimizations to keep large repositories responsive don't apply here because I am dealing with a single project.

P.P.S. This is currently being looked into also: http://www.ibm.com/developerworks/java/library/j-svnbins.html

Talesh asked Apr 14 '09 20:04



1 Answer

Firstly, upgrade to SVN 1.6 on both client and server. The latest release notes mention a speedup for large files (r36389).

Secondly, use sparse directories, although this may not suit you if you need the entire project in your working copy. We do this for our large repo: the first thing a client does is check out the top-level directory only; then, to get more data, they use the repo browser to go to the desired directory and "update to this revision" on it. It works wonderfully in TortoiseSVN. 1.6 also has the 'reduce depth' option to remove directories you no longer need to work on.
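For anyone scripting this rather than clicking through TortoiseSVN, the same sparse-directory workflow can be sketched with the command-line client's `--depth` and `--set-depth` options. This is a minimal sketch, not our actual setup: the repository URL and directory names are placeholders, the helper function names are hypothetical, and `--depth immediates` is just one reading of "top-level directory only" (`--depth empty` is the stricter choice). The `exclude` depth needs a 1.6 client.

```python
import subprocess


def sparse_checkout_cmd(repo_url, wc_dir):
    """Check out only the top level: files plus empty immediate subdirectories."""
    return ["svn", "checkout", "--depth", "immediates", repo_url, wc_dir]


def expand_cmd(path):
    """Pull in the full contents of one directory you need to work on."""
    return ["svn", "update", "--set-depth", "infinity", path]


def exclude_cmd(path):
    """Drop a directory you no longer need from the working copy (svn 1.6+)."""
    return ["svn", "update", "--set-depth", "exclude", path]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder URL and paths, printed as a dry run only.
    run(sparse_checkout_cmd("http://svn.example.com/repo/trunk", "project"))
    run(expand_cmd("project/src"))
    run(exclude_cmd("project/images"))
```

Building the argument lists separately from running them keeps the script easy to review before anyone lets it touch a 30 GB working copy.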

If this isn't for you, you can still do an update on parts of the working copy. Update tends to get slower the more files you have (on Windows, that is; NTFS seems to be particularly poor with the locking strategy used for updating). Bert Huijben noticed this and suggested a fix, TBA with the 1.7 release, but you could rebuild your current code with his 'quick fix'.
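Updating only parts of the working copy is just `svn update` with explicit targets. A minimal sketch, assuming a hypothetical helper `update_cmd` and placeholder directory names:

```python
import os


def update_cmd(wc_root, subdirs):
    """Build an `svn update` command limited to the given subdirectories."""
    if not subdirs:
        raise ValueError("name at least one subdirectory to update")
    targets = [os.path.join(wc_root, d) for d in subdirs]
    return ["svn", "update"] + targets
```

For example, `update_cmd("project", ["src", "docs"])` updates those two folders and leaves everything else, such as a large images folder, at its current revision. The trade-off is a mixed-revision working copy, so a full update is still worth doing before trusting a local build.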

An alternative could be to change your filesystem. If you can reformat, you could try the ext2 IFS driver, but I'm sure you'd be cautious of that!

Last option: turn off your virus scanner for .svn directories, and also for the repository on the server. If you're running Apache on the server, make sure you have keep-alives on for a short time (to prevent re-authentication from occurring). Also turn off indexing on your working copy directories, and shadow copy too. (The last doesn't help much, but you may see a better improvement than I did; turning AV off on the server boosted my SVN response 10x, though.)

gbjbaanb answered Oct 03 '22 19:10
