
How do I reduce the size of a bloated Git repo by non-interactively squashing all commits except for the most recent ones?

My Git repo holds hundreds of gigabytes of data (database backups, for instance), and I'm trying to remove the old, outdated backups because they make everything larger and slower. So I need something fast; the faster, the better.

How do I squash (or just plain remove) all commits except for the most recent ones, and do so without having to manually squash each one in an interactive rebase? Specifically, I don't want to have to use

git rebase -i --root

For example, I have these commits:

A .. B .. C ... ... H .. I .. J .. K .. L

What I want is this (squashing everything in between A and H into A):

A .. H .. I .. J .. K .. L

Or even this would work fine:

H .. I .. J .. K .. L

There is an answer on how to squash all commits, but I want to keep the more recent commits, and keep them unsquashed. (In particular, I need to keep the two most recent commits.)

(Edit, several years later. The right answer to this question is to use the right tool for the job. Git is not a very good tool to store backups, no matter how convenient it is. There are better tools.)

sanmai asked Jun 11 '14 02:06




1 Answer

The original poster comments:

if we take a snapshot of a commit 10004, remove all commits before it, and make commit 10004 a root commit, I'll be just fine

Here is one way to do this, assuming your current branch is called branchname. I like to use a temp tag whenever I do a large rebase, both to double-check that nothing changed and to mark a point I can reset back to if something goes wrong (I'm not sure whether this is standard procedure, but it works for me):

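git tag temp   # marks the current tip (branchname is checked out)

git checkout 10004
git checkout --orphan new_root   # parentless branch; the index still holds 10004's files
git commit -m "set new root 10004"

git rebase --onto new_root 10004 branchname   # replay the commits after 10004 onto the new root

git diff temp   # verification that it worked with no changes
git tag -d temp
git branch -D new_root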
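
To see the steps above end to end, here is a minimal sketch that runs them in a throwaway repository. Everything in it is an assumption made for illustration: the branch name `work`, the demo identity, the six one-file commits, and the choice of the third commit as the new root (standing in for 10004).

```shell
#!/bin/sh
set -eu

# Hypothetical identity so commits succeed in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository with a branch called "work" (illustrative name).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/work

# Six commits, each adding one file: c1 .. c6.
for i in 1 2 3 4 5 6; do
    echo "backup $i" > "file$i.txt"
    git add "file$i.txt"
    git commit -q -m "c$i"
done

cut=$(git rev-parse HEAD~3)   # c3 plays the role of commit 10004
git tag temp                  # safety marker at the old tip

# Snapshot c3 as a parentless commit, then replay c4..c6 on top of it.
git checkout -q "$cut"
git checkout -q --orphan new_root
git commit -q -m "set new root"
git rebase -q --onto new_root "$cut" work

# Verify: history is shorter, final content is unchanged.
commits_after=$(git rev-list --count work)
if git diff --quiet temp; then same=yes; else same=no; fi

git tag -d temp >/dev/null
git branch -q -D new_root
echo "commits: $commits_after, identical to old tip: $same"
```

With these assumptions the branch ends up with four commits (the new root plus c4, c5, and c6) and the same files as before. Note that a plain rebase linearizes history, so a range containing merge commits would need extra care.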

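To get rid of the old history, you'll need to delete every tag and branch (including remote-tracking branches) that still points into it. The reflog also keeps the old commits reachable, so expire it first; then

git reflog expire --expire=now --all
git gc --prune=now

will clean it from your repo.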
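
Commits that are still listed in the reflog count as reachable, so garbage collection keeps them even after every branch and tag pointing at them is gone. The sketch below shows this in a throwaway repository; the branch names `old` and `fresh`, the file names, and the demo identity are all made up for illustration.

```shell
#!/bin/sh
set -eu

# Hypothetical identity so commits succeed in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/old

# One commit that stands in for the unwanted old history.
echo "big backup" > dump.sql
git add dump.sql
git commit -q -m "old backup"
old_commit=$(git rev-parse HEAD)

# Start an unrelated history and delete the only branch on the old one.
git checkout -q --orphan fresh
git rm -q -f dump.sql
echo "current data" > keep.txt
git add keep.txt
git commit -q -m "fresh start"
git branch -q -D old

# First attempt: the HEAD reflog still references the old commit.
git gc -q --prune=now
if git cat-file -e "$old_commit" 2>/dev/null; then before=present; else before=gone; fi

# Second attempt: expire the reflog, then collect again.
git reflog expire --expire=now --all
git gc -q --prune=now
if git cat-file -e "$old_commit" 2>/dev/null; then after=present; else after=gone; fi

echo "after plain gc: $before, after reflog expire + gc: $after"
```

Running `git count-objects -vH` before and after is a simple way to check how much space was actually reclaimed.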

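Note that the old and the new history will temporarily coexist until you have gc'd, but that is unavoidable; a standard squash and rebase leaves the old commits around in the same way. Because Git stores objects by content, the two histories share any files and trees that are identical, so the duplication is mostly in the commit objects themselves.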

M.M answered Oct 06 '22 01:10