
Git - Squash All Commits in History Before Specific Commit

Tags:

git

I have a Mercurial repo that I am converting to Git. The commit history is quite large and I do not need all of the commit history in the new repo. Once I convert the commit history to Git (and before pushing to the new repo), I want to squash all the commits before a certain tag into one commit.

So, if I have:

commit 6
commit 5
commit 4
commit 3
commit 2
commit 1 -- First commit ever

I want to end up with:

commit 6
commit 5
commit X -- squashed 1, 2, 3, 4

Note: There are thousands of commits that I need to squash. So, manually picking/marking them one by one is not an option.

GreenSaguaro asked Dec 03 '18 05:12




1 Answer

The other answers so far suggest rebase. This can work in some cases, depending on the commit graph in the converted-to-Git repository, and the newer rebase with --rebase-merges can definitely do it. But it's a clumsy way to go about it.

The ideal way to do this is to start the conversion at the first commit you want to keep. That is, have your Mercurial exporter export, as Git's first commit, the revision you want to treat as the root. Then have the exporter go on to export that commit's descendants, one at a time, into the importer, in the same way it was always going to do this job (whatever way that may be).

Whether and how you can do this depends on what tool(s) you are using to convert. (I have not actually done any of these conversions, but most people seem to use hg-fast-export and git fast-import. I have not looked much at the inner details of hg-fast-export but there's no obvious reason it couldn't do this.)


Fundamentally (internally), Mercurial stores commits as changesets. This is not the case for Git: Git stores snapshots instead. However, Mercurial checks out (i.e., extracts) snapshots by summing together changesets as required. So if your tool works by doing hg checkout (or the internal equivalent thereof), there is no issue here in the first place: you skip every revision prior to the first snapshot you want, import that snapshot and its descendants into Git, and the resulting Git history begins at the desired point.


If the tools you have make this inconvenient, though, note that after converting the entire repository history, including all branches and merges, into Git snapshots, your Git repository makes this relatively easy as a second pass. Your Git history might, e.g., look like this:

          o-..-o            o--o   <-- br1
         /      \          /
...--o--o--....--o--*--o--o--o--o   <-- br2
      \         /             \
       o--...--o               o   <-- master

where commit * is the first commit you wanted to see in your Git repository. (Note that if there are multiple histories going back before *, you have a different issue and cannot do this kind of transformation in the first place without additional history-modification. But as long as * is on a sort of choke point, as it is in this diagram, it's easy to snip the graph here.)
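The choke-point condition can be checked before cutting. The following is only a sketch: is_choke_point is a hypothetical helper name, not a Git command, and it assumes no replacement refs exist yet. It compares the commits lying outside the history of * with the commits that descend from *; if the two counts differ, some history bypasses *.

```shell
# Sketch: succeed only if every commit reachable from any ref is either
# in the history of the cut commit or a descendant of it.
# is_choke_point is a hypothetical helper name, not a Git command.
is_choke_point() {
    cut=$(git rev-parse --verify "$1^{commit}") || return 2
    # Commits reachable from any ref that are not in the cut commit's history.
    outside=$(git rev-list --count --all "^$cut")
    # Of those, the commits that descend from the cut commit.
    descendants=$(git rev-list --count --ancestry-path --all "^$cut")
    [ "$outside" -eq "$descendants" ]
}
```

Called as is_choke_point <hash-of-*>, it returns success when the graph can be snipped at that commit and failure when some branch or merge reaches back past it by another route.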

To remove everything before *, simply use git replace to make an alternative commit that's very much like commit *, but has no parent:

git replace --graft <hash-of-*>
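If the cut point is named by a tag, as in the question, the graft step might be sketched as follows. graft_root_at is a hypothetical helper name, and the two counts it prints are just a sanity check that the replacement is in effect before anything is rewritten.

```shell
# Sketch: graft a parentless replacement onto the commit a tag (or any
# other revision name) points at, then compare the history length seen
# with and without the replacement.
# graft_root_at is a hypothetical helper name, not a Git command.
graft_root_at() {
    # Peel a tag down to the commit it refers to.
    cut=$(git rev-parse --verify "$1^{commit}") || return 1
    # No parents listed: the replacement commit becomes a root commit.
    git replace --graft "$cut" || return 1
    echo "with replacement:    $(git rev-list --count HEAD)"
    echo "without replacement: $(git --no-replace-objects rev-list --count HEAD)"
}
```

The second count should still be the full history length, since nothing has been rewritten at this point; git replace -d <hash-of-*> undoes the graft.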

You now have a replacement that most of Git will use instead of *, that has no parent commit. Then run git filter-branch over all branches and tags, with the no-op filter:

git filter-branch --tag-name-filter cat -- --all

Or, once git filter-repo is included with Git (or if you've installed it):

git filter-repo --force

(Be careful with the --force option when using filter-repo: this makes it destroy the old history in this repository, but in this case, that's what we want.)

This will copy every reachable commit, including the substitute for * but excluding the original * and everything before it, to new commits, then update your branch and tag names.

If using filter-branch, remove the refs/originals/ name-space (see the git filter-branch documentation for details), force early scavenging of the original objects if you like (the extra commits will eventually fall away on their own), and you're done.
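The filter-branch cleanup could look roughly like this. It is a sketch along the lines of the checklist in the git filter-branch documentation, wrapped in a hypothetical helper named cleanup_after_filter_branch. It is destructive: once it has run, the old commits cannot be recovered from this repository.

```shell
# Sketch of the cleanup after git filter-branch.
# cleanup_after_filter_branch is a hypothetical helper name.
# Destructive: the pre-rewrite commits are discarded for good.
cleanup_after_filter_branch() {
    # Delete the backup refs that filter-branch saved under refs/originals/.
    git for-each-ref --format='%(refname)' refs/originals/ |
    while read -r ref; do
        git update-ref -d "$ref"
    done
    # Expire all reflog entries so they no longer keep old commits alive.
    git reflog expire --expire=now --all
    # Repack and prune the objects that are now unreachable.
    git gc --quiet --prune=now
}
```

Run it only after checking that the rewritten branches and tags look right, for example with git log --oneline --graph --all.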

torek answered Sep 29 '22 18:09