
How do I combine several Git repositories without breaking file history?

Tags:

git

tfs

git-tfs

We are trying to migrate away from TFS. Using the git-tfs tool, we were able to migrate parts of the existing repo, but it crashes at certain troublesome checkins. We have been able to make a patchwork set of Git repos that cover most of the original TFS commits.

Currently have:

  • Git repo with changes from 2009 until 2011
  • Git repo with changes from 2011 until 2016
  • Git repo with changes from 2016 until current

Desired:

  • Big Git repo that covers 2009 until current
  • any file that existed that whole time would have a single file history

Is there any way for us to stitch these back together into a single Git repo? We don't care about retaining SHAs (they're all new anyway), but we can't break file history.

Scott Stafford asked Dec 21 '17 16:12


2 Answers

Edit: recent versions of Git have extended the git replace command so that this can be done more easily with git replace --graft <commit> <parent> (see https://git-scm.com/docs/git-replace#Documentation/git-replace.txt---graftltcommitgtltparentgt82308203 )


There is an easy way to do that using the 'graft' feature of Git. It has the same goal as the git replace command that @torek mentioned, but it is easier to use in your case.

First, import all the histories into the same repository. In the most recent repository, run the following for each of the 2 other repositories (<name> is any remote name you choose):

  1. git remote add <name> c:/path/toward/other/repository
  2. git fetch <name>
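
A minimal sketch of this import step, run on throwaway repositories rather than your real ones. The directory names, the remote name old, the branch name legacy, and the commit messages are all made up for illustration; the demo identity is only there so the commits succeed on a machine without Git configuration.

```shell
#!/bin/sh
set -e
# Illustrative identity so commits work without any Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Hypothetical "older" repository: two commits on a branch named legacy.
git init -q "$work/older"
git -C "$work/older" symbolic-ref HEAD refs/heads/legacy
git -C "$work/older" commit -q --allow-empty -m "2009: first checkin"
git -C "$work/older" commit -q --allow-empty -m "2011: last checkin of the old era"

# Hypothetical "recent" repository with its own, unrelated root commit.
git init -q "$work/recent"
git -C "$work/recent" commit -q --allow-empty -m "2011: first checkin of the new era"

# Import the older history: a remote needs both a name and a location.
git -C "$work/recent" remote add old "$work/older"
git -C "$work/recent" fetch -q old

# Both histories now sit in one repository but are still disconnected:
# each one has its own parentless (root) commit.
roots=$(git -C "$work/recent" rev-list --max-parents=0 --all | wc -l | tr -d ' ')
echo "root commits: $roots"
```

Seeing more than one root commit after the fetch is expected at this point; the graft step is what joins them.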

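Then create the Git graft file .git/info/grafts as described in https://git.wiki.kernel.org/index.php/GraftPoint. Each line names a commit followed by the parent it should appear to have, so with three repositories to join you should have 2 lines in your file.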

If you run git log or open any Git GUI, you should now see the history the way you want it.
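
To see the effect on something disposable first, here is a rough sketch that joins two unrelated histories inside one scratch repository. It uses git replace --graft (the command named in the edit note at the top) instead of a hand-written grafts file. The branch names era1 and era2 and the commit messages are invented for the demo.

```shell
#!/bin/sh
set -e
# Illustrative identity so commits work without any Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# First history (stands in for the older repository).
git symbolic-ref HEAD refs/heads/era1
git commit -q --allow-empty -m "era1: A"
git commit -q --allow-empty -m "era1: B"

# Second, unrelated history (stands in for the newer repository).
git checkout -q --orphan era2
git commit -q --allow-empty -m "era2: C"
git commit -q --allow-empty -m "era2: D"

before=$(git rev-list --count era2)

# Make the root of era2 appear to have the tip of era1 as its parent.
root2=$(git rev-list --max-parents=0 era2)
git replace --graft "$root2" era1

after=$(git rev-list --count era2)
echo "commits visible from era2: $before before the graft, $after after"
git log --oneline era2
```

At this stage nothing has been rewritten; the join lives only in a replacement ref, so it can still be undone with git replace -d.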

If you are satisfied, rewrite the history to make the grafts permanent with:

git filter-branch

You can now push your history to a central repository or share it.

PS: another write-up on the subject, which mixes the grafts and replace features of Git: https://legacy-developer.atlassian.com/blog/2015/08/grafting-earlier-history-with-git/

Philippe answered Oct 23 '22 10:10


Git doesn't have file history.

Git stores commits, and commits are history. They are the only history there is. (I say it's not file history because it's commit history.) Each commit has a parent commit, or if the commit is a merge, two parents (or potentially more than two if it's an octopus merge).

Other than having a parent, each commit is a stand-alone snapshot of all the files that are in that commit. There's no history here: it's just a snapshot. If you want to see what happened between the previous commit and the current commit, you have Git extract the previous commit (snapshot O for Old) and the current commit (snapshot N for New) and run diff O N. That's what changed: whatever is different between O and N.
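
As a small illustration of that O-versus-N comparison, the sketch below builds a two-commit scratch repository and asks Git which paths differ between the parent snapshot and the current one. The file names and contents are arbitrary.

```shell
#!/bin/sh
set -e
# Illustrative identity so commits work without any Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Snapshot O: one file.
echo "version 1" > a.txt
git add a.txt
git commit -q -m "O: old snapshot"

# Snapshot N: a.txt modified, b.txt added.
echo "version 2" > a.txt
echo "brand new" > b.txt
git add a.txt b.txt
git commit -q -m "N: new snapshot"

# Compare the parent (O) with the current commit (N); Git computes the
# difference on demand from the two snapshots.
changed=$(git diff --name-only HEAD~1 HEAD)
echo "$changed"
```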

You can ask Git to synthesize a file history, but it does so by a horrible hack: it looks for one particular changed file, in each commit, as it goes back through commit history. It prints commits where that commit changes the file when compared to that commit's parent. If the file name changes—if the commit renamed the file—and you have used --follow, Git changes which (single) file name it's looking for, so now it's looking under the previous name.
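
A quick way to watch this happen, assuming a scratch repository with one renamed file (old.txt, new.txt and other.txt are made-up names): the same git log query is run with and without --follow.

```shell
#!/bin/sh
set -e
# Illustrative identity so commits work without any Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

printf 'line one\nline two\n' > old.txt
git add old.txt
git commit -q -m "add old.txt"

# Pure rename: the content is identical, so Git can pair the two names.
git mv old.txt new.txt
git commit -q -m "rename old.txt to new.txt"

# A commit that does not touch the file at all.
echo unrelated > other.txt
git add other.txt
git commit -q -m "add other.txt"

# Without --follow, Git only looks for the name new.txt.
plain=$(git log --format=%s -- new.txt | wc -l | tr -d ' ')
# With --follow, Git switches to the old name once it passes the rename.
followed=$(git log --follow --format=%s -- new.txt | wc -l | tr -d ' ')
echo "without --follow: $plain commit(s); with --follow: $followed commit(s)"
```

The commit that only adds other.txt is skipped in both cases because it does not change the file being looked for.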

If you have a history consisting of a sequence of commits:

(history starts here, at a root commit)
  |
  v

  o--o--<branches and merges...>--o   <-- end

and a second history:

  o--o--<branches and merges...>--o   <-- end

  o--o--...--o   <-- end2
  ^
  |
(we want to replace this one)

in a single repository, you can write a "replacement" commit object (using git replace) that is just like the second root commit that we want to replace, except for one thing: it has, as its parent commit, the commit to which end points.

This replacement commit effectively splices the two histories together.

Repeat this as desired for as many splices as you would like to add, for as many separate commit chains as you have in a single repository. Then you can run git filter-branch over this repository, which copies every commit, but follows the replacements. This has the effect of cementing the grafts in place. See "What does git filter-branch with no arguments do?" or "Rebase entire git branch onto orphan branch while keeping commit tree intact" for examples.
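
The cementing step can be tried out on a scratch repository along these lines. This is only a sketch: the branch names era1 and era2 are invented, the splice is created with git replace --graft, and FILTER_BRANCH_SQUELCH_WARNING is set just to skip the advisory pause that newer Git versions print before filter-branch runs. The --no-replace-objects option is used to count commits as they really are, ignoring any replacement refs.

```shell
#!/bin/sh
set -e
# Illustrative identity so commits work without any Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Two unrelated commit chains in one repository.
git symbolic-ref HEAD refs/heads/era1
git commit -q --allow-empty -m "era1: A"
git commit -q --allow-empty -m "era1: B"
git checkout -q --orphan era2
git commit -q --allow-empty -m "era2: C"
git commit -q --allow-empty -m "era2: D"

# Splice: a replacement for the second root whose parent is the tip of era1.
root2=$(git rev-list --max-parents=0 era2)
git replace --graft "$root2" era1

# Ignoring replacements, era2 still consists of its own two commits.
raw_before=$(git --no-replace-objects rev-list --count era2)

# Copy the commits of era2 while following the replacement, which writes
# the new parent into real commit objects.
git filter-branch -- era2 >/dev/null

raw_after=$(git --no-replace-objects rev-list --count era2)
echo "real commits reachable from era2: $raw_before before, $raw_after after"
```

filter-branch keeps the pre-rewrite branch tips under refs/original/, so the old state stays recoverable until those refs are deleted.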

torek answered Oct 23 '22 10:10