How to identify and potentially remove big binary commits inside an SVN repository?

Tags: svn, fsfs

I am working with an SVN repository that is over 3 years old, contains over 6,100 commits and is over 1.5 GB in size. I want to reduce the size of the SVN repository (I'm not talking about the size of a full SVN export - I mean the full repository as it would exist on the server) before moving it to a new server.

The current repository contains the source code for all of our software projects but it also contains relatively large binary files of no significance such as:

  • Full installers for a number of 3rd party tools.
  • .jpg & .png files (which are unmodified exports of PSDs that live in the same folder).
  • Bin and Obj folders (which were then 'svn ignored' in the next commit).
  • ReSharper directories.

A number of these large files have been 'SVN deleted' since they were added, creating a further problem of identifying the biggest offenders.

I want to either:

  • Create a new SVN repository that contains only the code for all of the software projects - it is really important that the copied files maintain their SVN history from the old repository.
  • Remove the large binary commits and files from the existing repository.

Is either of these possible?

InvertedAcceleration asked Feb 01 '10 13:02



1 Answer

Otherside is right about svnadmin dump, etc. Something like this will get you a rough pointer to revisions that added lots of data to your repo, and are candidates for svndumpfilter:

# Report the size in bytes of each revision's diff.
for r in $(svn log -q | grep '^r' | cut -d ' ' -f 1 | tr -d r); do
   echo "revision $r is $(svn diff -c "$r" | wc -c) bytes"
done
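
One caveat: svn diff prints only a placeholder for files marked as binary, so the byte counts above can understate commits that added binaries. If you have filesystem access to the server, the size of each FSFS revision file is another rough indicator. The sketch below assumes an unpacked FSFS layout in which every revision is a file named after its number somewhere under db/revs; the function name largest_revs is made up for this example, and packed shards are simply skipped.

```shell
# Print "<bytes> <revision>" for the largest revision files in an FSFS repository.
# Assumption: unpacked FSFS layout, one file per revision under <repo>/db/revs.
# $1: path to the repository on the server, $2: how many entries to show (default 10)
largest_revs() {
    repo=$1
    count=${2:-10}
    find "$repo/db/revs" -type f | while IFS= read -r f; do
        rev=${f##*/}
        case $rev in
            ''|*[!0-9]*) continue ;;   # skip files not named as a revision number (e.g. pack files)
        esac
        printf '%s %s\n' "$(wc -c < "$f" | tr -d ' ')" "$rev"
    done | sort -rn | head -n "$count"
}
```

For example, largest_revs /path/to/repo 20 (with your own repository path) would list the 20 biggest revision files, which you can then inspect with svn log -v -r REV.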

You could also try something like this to find revisions that added files with a particular extension (here, .jpg):

svn log -vq | grep -E "^r|\.jpg$" | grep -B 1 "\.jpg$"
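
A variation on the same idea, as a rough sketch: the filter below reads svn log -vq output on stdin, keeps only paths that were added (action A), and accepts several extensions at once. The function name added_by_ext is made up for this example, and added paths that carry a "(from ...)" copy suffix are not matched.

```shell
# Print "<revision> <path>" for every added path ending in one of the given extensions.
# $1: extensions as an extended-regex alternation, e.g. 'jpg|png|exe'
# stdin: output of `svn log -vq`
added_by_ext() {
    awk -v ext="$1" '
        /^r[0-9]+ \|/ { rev = $1; next }           # remember the current revision header
        /^   A / && $0 ~ ("\\.(" ext ")$") {       # added path with a matching extension
            sub(/^   A /, "")
            print rev, $0
        }
    '
}
```

Usage would look like: svn log -vq | added_by_ext 'jpg|png|exe'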

Matt McHenry answered Nov 02 '22 08:11