
How to merge several Git repos into one and interleave histories

My situation is that I have two Git repositories that I need to merge into a single repository (there are actually more repos, but I can start with two).

The two repositories are:

  • The main repository, A.
  • The second repository, B.

The code in repository B depends on the code in repository A (but not vice versa), and the two histories roughly track each other chronologically: a given commit in repo B typically requires a commit from repo A with a very similar commit time.

Some branch and tag names exist in both repositories (with no guarantee that identically named refs belong together), but only the refs from A need to be preserved.

The requirements for the new repository, C, are:

  1. All refs (branches and tags) from A need to be preserved.
  2. Only the master branch commits from B need to be preserved (i.e. the commits that are reported by git log --first-parent master).
  3. The files from each source repository should be put into subfolders of the new repository (i.e. the files from A shall go into A/, and the files from B shall go into B/).
  4. When checking out a specific commit in repository C (including commits made before the merge, e.g. a release tag), compatible files from both source repositories should be found in the directories A/ and B/ (at least within a commit or two).
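Requirement 2 refers to the commits reported by git log --first-parent master. As a minimal sketch, that list could be collected together with committer timestamps as shown below. The function names parse_log and first_parent_commits are hypothetical helpers, and the choice of committer time (%ct) rather than author time is an assumption.

```python
import subprocess


def parse_log(text):
    """Parse lines of '<sha> <unix-time>' into a list of (sha, time) tuples."""
    commits = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        sha, timestamp = line.split()
        commits.append((sha, int(timestamp)))
    return commits


def first_parent_commits(repo_path, branch="master"):
    """Return (sha, committer_time) pairs for the first-parent chain, oldest first."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--first-parent", "--reverse",
         "--format=%H %ct", branch],
        check=True, capture_output=True, text=True,
    )
    return parse_log(result.stdout)
```

Calling first_parent_commits("path/to/B") would then give the commits of B that need to be carried over into C.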

So far I have tried several approaches, including this and git-stitch-repo, without success (they did not fulfill the above requirements).

At this point, I have managed to:

  • Move all files in each repo to a subdirectory using git filter-branch. E.g. for repo A:
    mkdir A
    mv * .gitignore A/ 2> /dev/null
    git commit -a -m 'DROPME' > /dev/null
    # Rewrite every commit reachable from any ref so that each path in the index is prefixed with A/.
    git filter-branch --tag-name-filter cat --index-filter 'git ls-files -s | sed "s-\t\"*-&A/-" | GIT_INDEX_FILE=$GIT_INDEX_FILE.new git update-index --index-info && mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE" ||:' -- --all
    git reset --hard origin/master
    # Delete the backup refs that git filter-branch leaves under refs/original/.
    git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
  • Import repo B into A using git fast-export/fast-import.
  • Devise a method for generating a mapping such that for a given SHA in A, there is a list of zero, one or more SHAs that should be inserted from B.
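The mapping in the last bullet could be built in several ways; the question does not say which one was used. One possible sketch, assuming each commit is represented as a (sha, unix_time) pair and both lists are sorted by time, attaches every B commit to the latest A commit that is not newer than it. The name map_b_onto_a is hypothetical.

```python
from bisect import bisect_right


def map_b_onto_a(a_commits, b_commits):
    """Map each A sha to the list of B shas that should follow it.

    Both arguments are non-empty lists of (sha, unix_time) sorted by time,
    oldest first. This ordering is an assumption of the sketch.
    """
    a_times = [time for _, time in a_commits]
    mapping = {sha: [] for sha, _ in a_commits}
    for b_sha, b_time in b_commits:
        # Index of the last A commit whose time is <= the B commit's time.
        index = bisect_right(a_times, b_time) - 1
        # B commits older than all of A are attached to the first A commit.
        index = max(index, 0)
        mapping[a_commits[index][0]].append(b_sha)
    return mapping
```

Real commit times are not guaranteed to increase monotonically along a branch, so the input lists may need to be sorted or cleaned up before this kind of lookup gives sensible results.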

What I would expect now is that some clever use of git filter-branch would let me insert the selected commits from B into the master branch of A. But how?

m-bitsnbites asked Dec 05 '16 14:12

1 Answer

The solution turned out to be much more involved than I had hoped for. It involves manipulating and combining the output of two (or more) git fast-export streams, and importing them into a new repository using git fast-import.

In short, a new fast-import stream is generated by traversing the two input streams and switching back and forth between them based on a date-sorted log of the main branches.
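The date-based switching can be illustrated with a small sketch. It only computes the order in which commits from the two repositories would be visited. It does not read or rewrite fast-export streams, and it is not taken from the script mentioned in this answer. The name interleave and the (sha, unix_time) representation are assumptions.

```python
import heapq


def interleave(a_commits, b_commits):
    """Merge two time-sorted commit lists into one visiting order.

    Each input is a list of (sha, unix_time) sorted oldest first.
    Returns a list of (repo, sha) pairs. On equal timestamps the commit
    from A comes first, since B depends on A.
    """
    tagged_a = ((time, 0, sha, "A") for sha, time in a_commits)
    tagged_b = ((time, 1, sha, "B") for sha, time in b_commits)
    merged = heapq.merge(tagged_a, tagged_b)
    return [(repo, sha) for _, _, sha, repo in merged]
```

A stream writer would walk this order and, each time the repo label changes, continue emitting commits from the other input stream, with the parent of each emitted commit set to the previously emitted one.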

I have implemented the solution in a Python script called join-git-repos.py, which I put in a GitHub repository here.

m-bitsnbites answered Oct 08 '22 08:10