I'm starting a project using git where I'll be committing very large files, but only a few times a week. I've tried to use git as-is and it seems to store the entire file in each commit where it is changed. This will not work for this project, because the repository would grow out of control. So I want to reduce the size of the repository.
My first thought was to "simply" remove all commits older than, say, two weeks, or only keep e.g. five commits in the history (this is probably better :)). I've googled and read a lot from The Git Community Book and I guess I'm gonna need to work with git-rebase or git-filter-branch. The thing is I just can't seem to get it to work.
Just to illustrate: I have a history H with only one branch (the master branch)
A --> B --> C --> D --> E
I want to remove some previous commits to make my history look like
C --> D --> E
Commits A and B should be completely purged. I've tried git-rebase but it seems to merge commits together rather than actually removing old ones; maybe I don't fully understand how rebase works. Another thought I had was to remove everything from .git/objects and then build a new commit using git-hash-object -w, git-mktree and git-commit-tree. I have not yet managed to push this "artificial" tree to the server though.
I won't be working with any branches, so there's no need to take these into account.
What I'm wondering is if anyone can give me concrete usages of git-rebase, if that's what I'm supposed to use? Or some other tips or examples of what I can do.
Cheers!
Edit:
The large files will not be the same large files all the time, and some files will be replaced by new files. I want these replaced files to be completely purged from the history.
If you want to remove the "bad" commit altogether (and every commit that came after it), do a git reset --hard ABC, assuming ABC is the hash of the "bad" commit's parent, i.e. the one you want to see as the new head commit of that branch. Then do a git push --force (or git push -f) to overwrite the branch on the server.
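Here's a rough sketch of that reset-and-force-push sequence in a throwaway setup, in case it helps to see it end to end. The temp directories, the bare "server" repository, the file name and the commit messages A/B/C are all made up for the demo; ABC is taken to be the commit just before the bad one.

```shell
#!/bin/sh
set -e

# Throwaway repositories; every path and commit message here is invented for the demo.
work=$(mktemp -d)
git init -q --bare "$work/server.git"
git init -q "$work/repo"
cd "$work/repo"
git config user.email "demo@example.com"
git config user.name "Demo"

# Build a linear history A -> B -> C, where C plays the "bad" commit.
for name in A B C; do
    echo "$name" > file.txt
    git add file.txt
    git commit -q -m "$name"
done
branch=$(git symbolic-ref --short HEAD)
git remote add origin "$work/server.git"
git push -q origin "$branch"

# ABC is the commit that should become the new tip (the one before the bad commit).
ABC=$(git rev-parse HEAD~1)
git reset -q --hard "$ABC"

# The server still has C, so a plain push would be rejected; --force overwrites the branch.
git push -q --force origin "$branch"

# Show the remaining history, newest first.
git log --format=%s
```

Keep in mind that this drops commits from the tip of the branch, not from the start of the history, and that a force push rewrites the branch for everyone else who pulls from that server.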
The easiest way to delete a file in your Git repository is to execute the “git rm” command and to specify the file to be deleted. Note that by using the “git rm” command, the file will also be deleted from the filesystem. This only affects the next commit; the file's contents stay in the earlier commits that contained it.
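A small sketch of what git rm does and doesn't do, using a scratch repository. The file name big.bin and its contents are placeholders, not anything from the question.

```shell
#!/bin/sh
set -e

# Scratch repository; big.bin is a made-up stand-in for a large file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "large payload" > big.bin
git add big.bin
git commit -q -m "Add big.bin"

# Removes big.bin from both the index and the working directory.
git rm -q big.bin
git commit -q -m "Remove big.bin"

# The file is gone from disk...
ls
# ...but its contents are still reachable through the first commit.
git show HEAD~1:big.bin
```

So git rm on its own won't shrink the repository; the old blob stays around for as long as a commit that references it is part of the history.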
This should be a simple git rebase -i (with --root, since A is the first commit in the history) where you have

    pick A
    squash B
    squash C
    pick D
    pick E

and then edit the commit message for A-C to be just C's commit message.
git-rebase will "squash" the three commits into a single commit whose objects are the same as commit C's objects.
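To make the squash concrete, here's a minimal sketch that runs it non-interactively in a scratch repository. Normally you'd edit the todo list by hand; the GIT_SEQUENCE_EDITOR one-liner below is just a scripted stand-in for that and assumes a sed that accepts -i -e. The file name big.bin and the commit messages are invented. GIT_EDITOR=true accepts the combined message as-is, so here the squashed commit keeps the messages of A, B and C rather than only C's.

```shell
#!/bin/sh
set -e

# Scratch repository; big.bin and the commit messages are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Linear history A -> B -> C -> D -> E, each commit replacing the same file.
for name in A B C D E; do
    echo "contents at $name" > big.bin
    git add big.bin
    git commit -q -m "$name"
done
tree_c=$(git rev-parse "HEAD~2^{tree}")

# Scripted stand-in for editing the todo list: turn the 2nd and 3rd "pick"
# (B and C) into "squash". --root lets the rebase reach the first commit A.
# GIT_EDITOR=true accepts the combined commit message without opening an editor.
GIT_SEQUENCE_EDITOR="sed -i -e '2,3s/^pick/squash/'" GIT_EDITOR=true \
    git rebase -q -i --root

# Show the rewritten history, oldest first.
git log --format=%s --reverse
```

After this the history has three commits, and the first one has the same tree as the original C. The old commits A and B are no longer part of the branch, but as far as I know they stay in .git/objects until the reflog entries expire and git gc prunes them, and the server only sees the new history after a forced push.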
Note: It may be possible to use git filter-branch to change the big files in the previous commits to actually match the new ones, if you'd rather do that. But it's a dangerous operation and I don't want to give you a bad command by accident.