
Remove old commit information from a git repository to save space

Tags: git

I have a repository for storing some large binary files (tifs, jpgs, pdfs) that is growing pretty large. There are also a fair number of files that are created, removed, and renamed, and I don't care about the individual commit history. This question is somewhat simplified because I'm dealing with a repository that has no branches and no tags.

I'm curious if there's an easy way to remove some of the history from the system to save space.

I found an old thread on the git mailing list but it doesn't really specify how to use this (i.e. what the $drop is):

git filter-branch --parent-filter "sed -e 's/-p $drop//'" \
        --tag-name-filter cat -- \
        --all ^$drop 

greggles asked Oct 12 '12 18:10


2 Answers

I think you can shrink your history by following this answer:

How to delete a specific revision of a github gist?

Decide which points in history you want to keep. In the todo list of an interactive rebase (git rebase -i), the commits are listed oldest first:

pick <hash1> <commit message>
pick <hash2> <commit message>
pick <hash3> <commit message>   <- keep
pick <hash4> <commit message>
pick <hash5> <commit message>
pick <hash6> <commit message>   <- keep
pick <hash7> <commit message>
pick <hash8> <commit message>
pick <hash9> <commit message>
pick <hash10> <commit message>  <- keep

Then leave the very first commit and the first commit after each "keep" as "pick", and mark all the others as "squash".

pick   <hash1> <commit message>
squash <hash2> <commit message>
squash <hash3> <commit message>   <- keep
pick   <hash4> <commit message>
squash <hash5> <commit message>
squash <hash6> <commit message>   <- keep
pick   <hash7> <commit message>
squash <hash8> <commit message>
squash <hash9> <commit message>
squash <hash10> <commit message>  <- keep

Then run the rebase by saving and quitting the editor. At each "keep" point, the message editor will pop up with a combined commit message covering everything from the previous "pick" up to the "keep" commit. You can then either keep just the last message or combine them all to document the original history without keeping the intermediate states.
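
For what it's worth, the squash plan above can also be driven without typing into the editor. Here is a minimal sketch, assuming a throwaway repository with ten linear commits. The repository, file name, and commit messages are made up for illustration, and GIT_SEQUENCE_EDITOR with sed only stands in for editing the todo list by hand.

```shell
#!/bin/sh
set -e

# Throwaway repository with ten commits (hypothetical demo data).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
for i in 1 2 3 4 5 6 7 8 9 10; do
    echo "version $i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done

# Edit the todo list non-interactively: lines 1, 4 and 7 stay "pick",
# every other "pick" line becomes "squash". GIT_EDITOR=true accepts the
# proposed combined commit messages unchanged.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1b' -e '4b' -e '7b' -e 's/^pick /squash /'" \
GIT_EDITOR=true \
git rebase -q -i --root

# Show the rewritten history: one commit per "keep" point.
git log --oneline
```

In a real repository you would normally run git rebase -i --root yourself and edit the list by hand; --root is only needed when the very first commit is part of the range you want to squash.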

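After that rebase, the intermediate file data will still be in the repository, but the branch no longer references it. git gc will only remove that data once nothing points at the old commits any more. The reflog keeps referencing them until its entries expire (or you expire them yourself with git reflog expire), and unreachable objects are kept for a grace period unless you run git gc with --prune=now.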
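
The cleanup step might look something like this. It is a sketch on a throwaway repository (all names and file contents are hypothetical), where a reset stands in for the rebase as the thing that leaves an old commit behind. Only do this once you are sure you will not want the old history back, since it throws away the reflog that would let you undo the rewrite.

```shell
#!/bin/sh
set -e

# Throwaway repository (hypothetical demo data). In practice, only the
# update-ref, reflog expire and gc commands below are run in your own repo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "keep me" > kept.txt
git add kept.txt
git commit -q -m "kept commit"

# A commit that gets thrown away, standing in for squashed-out history.
echo "temporary large file contents" > big.bin
git add big.bin
git commit -q -m "commit to discard"
dropped=$(git rev-parse HEAD)
git reset -q --hard HEAD~1

# The discarded commit is still in the object store at this point.
git cat-file -e "$dropped" && echo "before cleanup: old commit still present"

# ORIG_HEAD still names the old tip after a reset or rebase; drop it too
# so that nothing points at the discarded commit.
git update-ref -d ORIG_HEAD

# Expire every reflog entry now, then prune unreachable objects
# immediately instead of waiting for the default grace periods.
git reflog expire --expire=now --all
git gc -q --prune=now

if git cat-file -e "$dropped" 2>/dev/null; then
    echo "after cleanup: old commit still present"
else
    echo "after cleanup: old commit removed"
fi
```

Running git count-objects -vH before and after is a quick way to see how much space was actually reclaimed.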

Tilman Vogel answered Oct 02 '22 16:10


You could always just delete .git and do a fresh git init with one initial commit. This will, of course, remove all commit history.
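
Roughly, that would go as follows. This is a sketch on a throwaway repository; the file name, identity, and commit messages are made up for illustration.

```shell
#!/bin/sh
set -e

# Throwaway repository with some history (hypothetical demo data).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
for i in 1 2 3; do
    echo "version $i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done

# Discard all history: remove the repository metadata, keep the working files.
rm -rf .git

# Start over with a single commit containing the current state.
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
git add -A
git commit -q -m "Initial commit (history discarded)"

git log --oneline
```

Keep in mind that deleting .git also drops everything else stored there, such as remotes, hooks, and repository-local configuration, so those have to be set up again afterwards.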

ezod answered Oct 02 '22 15:10