Fix git history with duplicate commits

Tags:

git

Short version:
How do I go from the left graph to the right one? Also, do I need to manually remove the duplicate commits (root2, a', b'), or will the GC prune them at some point in the future?

[Image: before-after]

Long version:
Due to a bad CVS-to-git port, I ended up with two alternate histories in the same repo: I imported branch_1 from CVS after master had already been created in git. As can be seen, some commits of branch_1 are unique, but others are duplicates. What is the easiest way to fix this?

I have some ideas but am not sure how to execute them. One would be to remove branch_1 altogether and start again, but I don't know how to make git recognize that root1 and root2 are the same so that all patches get applied on the same line. Another idea would be to rebase f, g, h onto b and somehow remove root2, a' and b'. But I think that, since there is no common ancestor, a normal rebase will not work.

In reality, the branch has hundreds of duplicate and unique commits, so a highly manual approach is not practical.

Hilikus asked Apr 27 '15 20:04


1 Answer

One option you can try first, which is non-destructive, is to use a Graft Point:

# .git/info/grafts
0000000f 0000000b

Replace those with the full 40-character SHA1s of commits f and b in your diagram (the child first, then the parent it should appear to have).

This change is not permanent, but you can review the result with:

git log --graph --all --decorate
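
As a rough sketch of the steps above, the script below builds a throwaway repository shaped like the left graph in the question and grafts f onto b. Note that it uses git replace --graft rather than a hand-edited .git/info/grafts: as far as I know, recent Git versions deprecate the grafts file in favor of replace refs, which have the same effect. The temporary repository, the empty commits and the "Demo" identity are all made up for illustration.

```shell
#!/bin/sh
# Demo only: everything happens in a temporary repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # force the branch name used in the question
git config user.name "Demo"
git config user.email "demo@example.com"

# master: root1 <- a <- b
for msg in root1 a b; do
    git commit -q --allow-empty -m "$msg"
done
b=$(git rev-parse HEAD)

# branch_1: unrelated history root2 <- a' <- b' <- f <- g
git checkout -q --orphan branch_1
for msg in root2 "a'" "b'"; do
    git commit -q --allow-empty -m "$msg"
done
git commit -q --allow-empty -m f
f=$(git rev-parse HEAD)
git commit -q --allow-empty -m g

# Graft: make f appear to have b as its only parent.
git replace --graft "$f" "$b"

# Review: branch_1 should now sit on top of master, without root2, a', b'.
git log --graph --all --decorate --oneline
```

If the result looks wrong, git replace -d followed by the SHA1 of f removes the graft again and the original history reappears.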

If all looks good, you can run git filter-branch to make those changes permanent.
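
A minimal sketch of that last step, again in a throwaway repository with made-up commits. It assumes the graft is recorded with git replace --graft (the git filter-branch documentation says it honors both the grafts file and replace refs), and it passes the two branch names explicitly instead of rewriting every ref. FILTER_BRANCH_SQUELCH_WARNING only silences the deprecation notice that newer Git versions print.

```shell
#!/bin/sh
# Demo only: everything happens in a temporary repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# master: root1 <- a <- b
for msg in root1 a b; do
    git commit -q --allow-empty -m "$msg"
done
b=$(git rev-parse HEAD)

# branch_1: unrelated history root2 <- a' <- b' <- f <- g
git checkout -q --orphan branch_1
for msg in root2 "a'" "b'" f; do
    git commit -q --allow-empty -m "$msg"
done
f=$(git rev-parse HEAD)
git commit -q --allow-empty -m g

# Temporary graft: f pretends that b is its parent.
git replace --graft "$f" "$b"

# Bake the graft into real commit objects (f and g get new SHA1s).
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -- master branch_1

# The graft is no longer needed once the history has been rewritten.
git replace -d "$f"

# Check the real objects, ignoring any replace refs.
git --no-replace-objects log --graph --oneline master branch_1
```

filter-branch keeps the old branch tip under refs/original/, so the pre-rewrite history stays recoverable until that backup ref is deleted.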

Note that, until you run git filter-branch and push the result, nobody else will see these changes. This could be considered a "feature", since it doesn't force anyone else into a messy rebase of their work. The graft just adds some extra information that tells tools like git log that, when they look at commit "f", they should pretend its parent is "b", without actually updating the commit object to reflect that. Changing the parent of a commit object changes its SHA1, which means its child commits must be updated to point to the new SHA1, then their children, and so on. That is why making the graft permanent rewrites f and every commit after it.
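
To see the "pretend" part concretely, here is a small sketch that compares what is stored in commit f with what Git tools report once a graft is in place. It uses git replace --graft as a stand-in for the grafts file, and reduces the example to two hypothetical root commits named b and f in a temporary repository.

```shell
#!/bin/sh
# Demo only: two unrelated root commits in a temporary repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

git commit -q --allow-empty -m b
b=$(git rev-parse HEAD)

git checkout -q --orphan branch_1
git commit -q --allow-empty -m f
f=$(git rev-parse HEAD)

# Graft f onto b; the SHA1 that names f does not change.
git replace --graft "$f" "$b"

# The stored object for f is untouched: it still has no parent line.
stored=$(git --no-replace-objects cat-file -p "$f" | grep -c '^parent' || true)

# Ordinary commands see the grafted parent instead.
seen=$(git cat-file -p "$f" | sed -n 's/^parent //p')

echo "parent lines stored in f: $stored"
echo "parent reported for f:    $seen"
```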

pioto answered Sep 28 '22 07:09