
How do I re-integrate a svn and git repository without a common history?

Tags: git, git-svn

I have a GitHub-hosted git repository that represents development up to a certain point, and an svn repository (not initialized with git svn) that has further development. I want to bring the svn changes into the git repository, start using the git repo for development, and push changes using git svn dcommit. Is this possible? Is it advisable?

Here are my specifics:

We started development on a WordPress plugin here:

http://github.com/mrdoornbos/wpconfidentcaptcha

Master is at ef82b94a1232b44aae3e, and no further changes were made on GitHub.

When our application to wp-plugins.org was accepted, an empty svn repo was created for us:

http://svn.wp-plugins.org/wp-confident-captcha/trunk@278927

Somewhat modified files were then copied in (r256425). Further changes were made, the last being r278935.

What I want is for the SVN changes to be applied to master, along with the git svn metadata.

Here's what I have so far (takes about 4 minutes):

git clone git://github.com/mrdoornbos/wpconfidentcaptcha.git github_cc
cd github_cc
git svn init --stdlayout --prefix="svn/" http://svn.wp-plugins.org/wp-confident-captcha
git svn fetch --revision 256362:278935

This puts my github tree in origin/master, and my svn tree in svn/trunk (and all the tags in their own /svn branches as well). There is no common ancestor between origin/master and svn/trunk. I'm not sure where to go from here, or whether there is a way to get the changes from svn/trunk onto origin/master, so that the heads of the two repos have identical files and git svn dcommit works from origin/master.

Starting over with a new github repo seems like the most straightforward way, and I wouldn't be sad about losing the early history. But, it seems like there should be a way to make this work with the existing github repo.

(Edit: it looks like this was already asked as "How to merge two branches without a common ancestor?", but without the git filter-branch example needed to make it work. Unlike that question, these are public svn and git repos, so an answer with a working script is possible.)

jwhitlock asked Oct 05 '10 15:10


2 Answers

Here's what worked for me:

  1. Import the git and svn histories into one repository,
  2. Use grafts and filter-branch to attach the svn tree to the git head, and
  3. Reset the git-svn metadata to use the new history.

Import Histories

This part was already described in the question:

$ git clone git://github.com/mrdoornbos/wpconfidentcaptcha.git github_cc
$ cd github_cc
$ git svn init --stdlayout --prefix="svn/" http://svn.wp-plugins.org/wp-confident-captcha
$ git svn fetch --revision 256362:278935 # Takes about 4 minutes

Now the history looks like this (friendly commit names in parens):

$ git log --oneline --graph svn/trunk
* d9c713a (svn-z) Bump stable to 1.5.4
* 3febe34 (svn-y) Set display style to modal
... (other commits in svn tree)
* 2687d6a (svn-b) initial checkin
* 5c48853 (svn-a) adding wp-confident-captcha by mrdoornbos

$ git log --oneline --graph master
* ef82b94 (git-z) putting js file back
... (other commits in git tree)
* 8806456 (git-a) initial import

There are basically two independent histories in the repository, and it will take some gymnastics to join them.

Graft, Merge, and Filter to Rewrite History

In part 2, I use a graft to make the last git commit the parent of the first svn commit:

$ GRAFT_PARENT_GIT=`git log --pretty=format:'%H' -1 master`
$ GRAFT_FIRST_SVN=`git log --pretty=format:'%H' svn/trunk | tail -n1`
$ echo $GRAFT_FIRST_SVN $GRAFT_PARENT_GIT > .git/info/grafts
$ cat .git/info/grafts
5c48853d69cac0a4471fe96debb6ab2e2f9fb604 ef82b94a1232b44aae3ee5a998c2fa33acb6dcb0
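
Newer Git versions deprecate the .git/info/grafts file in favour of replacement refs. The following is a minimal sketch of the same join using git replace --graft, assuming a Git recent enough to have that option. It runs in a throwaway repository; the branch names gitside and svnside and the file contents are made up for illustration and stand in for master and svn/trunk.

```shell
#!/bin/sh
# Sketch: join two unrelated histories with `git replace --graft`.
# Everything happens in a scratch repository; names are illustrative only.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

# History 1: stands in for the GitHub master branch.
git symbolic-ref HEAD refs/heads/gitside
echo one > file.txt
git add file.txt
git commit -q -m "git-a"

# History 2: an orphan branch standing in for svn/trunk.
git checkout -q --orphan svnside
echo two > file.txt
git add file.txt
git commit -q -m "svn-a"
echo three > file.txt
git commit -q -am "svn-b"

# Make the last "git" commit the parent of the first "svn" commit.
first_svn=$(git rev-list --max-parents=0 svnside)
last_git=$(git rev-parse gitside)
git replace --graft "$first_svn" "$last_git"

# The log of svnside now runs through all three commits.
git log --oneline svnside
```

Like a graft, a replacement ref is local unless it is pushed explicitly, so the filter-branch step is still what makes the join permanent for everyone.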

Now the merge is super smooth:

$ git merge svn/trunk
Updating ef82b94..d9c713a
Fast-forward
 .gitignore                                   |    3 -
(rest of merge lines removed)

$ git log --oneline --graph master
* d9c713a (svn-z) Bump stable to 1.5.4
* 3febe34 (svn-y) Set display style to modal
... (other commits in svn tree)    
* 2687d6a (svn-b) initial checkin
* 5c48853 (svn-a) adding wp-confident-captcha by mrdoornbos
* ef82b94 (git-z) putting js file back

$ git svn info
Path: .
URL: http://svn.wp-plugins.org/wp-confident-captcha/trunk
Repository Root: http://svn.wp-plugins.org
Repository UUID: b8457f37-d9ea-0310-8a92-e5e31aec5664
Revision: 278935
Node Kind: directory
Schedule: normal
Last Changed Author: Confident Technologies
Last Changed Rev: 278935
Last Changed Date: 2010-08-21 00:04:49 -0500 (Sat, 21 Aug 2010)

This would work, but grafts are local to a repository: they are not transferred by push or clone. If I stick with the graft strategy, everyone else who wants to work with the svn repo will have to recreate the graft themselves. That is easy enough to script, but I can do better with git filter-branch. This command rewrites git history and has some really powerful options. Run with no filters, though, it does exactly what I want: it recomputes commit hashes, taking into account any 'fake' parents added by grafts:

$ git filter-branch master
Rewrite d9c713a99684e07c362b213f4eea78ab1151e0a4 (71/71)
Ref 'refs/heads/master' was rewritten

$ git log --oneline --graph master
* 51909da (svn-z') Bump stable to 1.5.4
* 7669355 (svn-y') Set display style to modal
... (other re-hashed commits in svn tree)  
* aed5656 (svn-b') initial checkin
* 0a079cf (svn-a') adding wp-confident-captcha by mrdoornbos
* ef82b94 (git-z) putting js file back

Now the git history looks like a proper sequence of changes, and others will see the same sequence without messing around with grafts.
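
A quick sanity check after the rewrite is to confirm that only the hashes changed, not the files. This is a small sketch, not part of the original procedure; same_tree is a hypothetical helper name. It compares the tree objects behind two commits, which are identical exactly when the two commits contain the same files.

```shell
# Hypothetical helper: succeeds when two commits point at the same tree,
# i.e. contain identical files. Returns 2 if either revision cannot be resolved.
same_tree() {
    tree_a=$(git rev-parse --verify --quiet "$1^{tree}") || return 2
    tree_b=$(git rev-parse --verify --quiet "$2^{tree}") || return 2
    [ "$tree_a" = "$tree_b" ]
}
```

In the repository above, "same_tree master svn/trunk && echo identical" would compare the rewritten master against the still-unrewritten svn/trunk. filter-branch also keeps the pre-rewrite ref under refs/original/, which gives a second point of comparison.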

Recreate git-svn Metadata

Git is happy, but git-svn isn't:

$ git svn info
Unable to determine upstream SVN information from working tree history

$ git log --oneline --graph svn/trunk
* d9c713a (svn-z) Bump stable to 1.5.4
* 3febe34 (svn-y) Set display style to modal

git-svn keeps its own metadata about commits (in .git/svn/*), and looks at the refs/remotes/svn/trunk ref (set up in the config during git svn init) to determine what the svn head commit is. I need to point the svn trunk ref at the new commit, and then recreate the metadata. This is the part that I'm not 100% sure about, but it works for me:

$ GIT_NEW_SVN_TRUNK=`git log --pretty=format:'%H' -1 master`
$ echo $GIT_NEW_SVN_TRUNK
51909da6a235b3851d5f76a44ba0e2d128ded465
$ git update-ref --no-deref refs/remotes/svn/trunk $GIT_NEW_SVN_TRUNK
$ rm -rf .git/svn  # Clear the metadata cache
$ git svn info     # Force a rebuild of the metadata cache
Migrating from a git-svn v1 layout...
Data from a previous version of git-svn exists, but
  .git/svn
  (required for this version (1.7.3.1) of git-svn) does not exist.
Done migrating from a git-svn v1 layout
Rebuilding .git/svn/refs/remotes/svn/trunk/.rev_map.b8457f37-d9ea-0310-8a92-e5e31aec5664 ...
r256362 = 0a079cfe51e4641da31342afb88f8b47a0b3f2f3
r256425 = aed565642990be56edc5d1d6be7fa9075bab880d
(...more lines omitted)
r278933 = 766935586d22770c3ef536442bb9e57ca3708118
r278935 = 51909da6a235b3851d5f76a44ba0e2d128ded465
Done rebuilding .git/svn/refs/remotes/svn/trunk/.rev_map.b8457f37-d9ea-0310-8a92-e5e31aec5664
Path: .
URL: http://svn.wp-plugins.org/wp-confident-captcha/trunk
(...and the rest of the git svn info output from above)
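
The rebuild works because git-svn records a git-svn-id: line in each commit message it imports, and filter-branch leaves the messages untouched. Here is a small sketch for checking which svn revision a commit claims to be; svn_rev_from_message is a hypothetical helper name, and the sample line is assembled from the URL, revision, and UUID shown in the output above.

```shell
# Hypothetical helper: read a commit message on stdin and print the SVN
# revision from its last "git-svn-id: URL@REV UUID" line, if there is one.
svn_rev_from_message() {
    sed -n 's/^ *git-svn-id: .*@\([0-9][0-9]*\) [0-9a-fA-F-]*$/\1/p' | tail -n 1
}

# Example with a sample line in the format git-svn writes:
printf '%s\n' "Bump stable to 1.5.4" "" \
    "git-svn-id: http://svn.wp-plugins.org/wp-confident-captcha/trunk@278935 b8457f37-d9ea-0310-8a92-e5e31aec5664" \
    | svn_rev_from_message   # prints 278935
```

Against a real commit, the message can be piped in with: git log -1 --pretty=format:%B master | svn_rev_from_message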

Recreate git-svn Metadata on Clone

If someone is cloning from my git repo, they get most of the git-svn metadata in the form of commit messages, but not enough to use git-svn themselves. Most people won't need to, but someday I'll need to set up a new computer or train my replacement. Here's what worked for me:

$ cd ..
$ git clone github_cc github_cc2
$ cd github_cc2
$ git svn init --stdlayout --prefix="svn/" http://svn.wp-plugins.org/wp-confident-captcha
$ git update-ref --no-deref refs/remotes/svn/trunk 51909da6a235b3851d5f76a44ba0e2d128ded465
$ git svn info
Rebuilding .git/svn/refs/remotes/svn/trunk/.rev_map.b8457f37-d9ea-0310-8a92-e5e31aec5664 ...
r256362 = 0a079cfe51e4641da31342afb88f8b47a0b3f2f3
r256425 = aed565642990be56edc5d1d6be7fa9075bab880d
(...more lines omitted)
r278933 = 766935586d22770c3ef536442bb9e57ca3708118
r278935 = 51909da6a235b3851d5f76a44ba0e2d128ded465
Done rebuilding .git/svn/refs/remotes/svn/trunk/.rev_map.b8457f37-d9ea-0310-8a92-e5e31aec5664
Path: .
URL: http://svn.wp-plugins.org/wp-confident-captcha/trunk
(...and the rest of the git svn info output from above)

Now the svn trunk is ready. To get the tags, I had to re-fetch:

$ git svn fetch -r256362:278935
(Lots of output, seemed to be about 4 minutes again)
$ git svn rebase # Fetch the rest of svn history and update metadata

I'm not sure if this exact sequence will work after there is more history in the tree.

I got some messages during git svn rebase:

W: Refspec glob conflict (ref: refs/remotes/svn/trunk):
expected path: wp-confident-captcha/branches/trunk
    real path: wp-confident-captcha/trunk
Continuing ahead with wp-confident-captcha/trunk

I fixed these by manually setting the svn configuration in .git/config:

[svn-remote "svn"]
  url = http://svn.wp-plugins.org
  fetch = wp-confident-captcha/trunk:refs/remotes/svn/trunk
  branches = wp-confident-captcha/branches/*:refs/remotes/svn/branches/*
  tags = wp-confident-captcha/tags/*:refs/remotes/svn/tags/*
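
The same settings can be written with git config instead of editing the file by hand, which is easier to repeat on a fresh clone. This is a sketch only: set_svn_remote_config is a hypothetical wrapper, and it assumes a Git that supports the -C option. The values are the ones from the config section above.

```shell
# Hypothetical wrapper: write the svn-remote settings into the repository
# at the given path (defaults to the current directory).
set_svn_remote_config() {
    repo=${1:-.}
    git -C "$repo" config --replace-all svn-remote.svn.url \
        'http://svn.wp-plugins.org' &&
    git -C "$repo" config --replace-all svn-remote.svn.fetch \
        'wp-confident-captcha/trunk:refs/remotes/svn/trunk' &&
    git -C "$repo" config --replace-all svn-remote.svn.branches \
        'wp-confident-captcha/branches/*:refs/remotes/svn/branches/*' &&
    git -C "$repo" config --replace-all svn-remote.svn.tags \
        'wp-confident-captcha/tags/*:refs/remotes/svn/tags/*'
}
```

--replace-all is used so that any values left behind by git svn init are overwritten rather than duplicated.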

Summary

This is a lot of work to make git svn rebase and git svn dcommit work. I learned a whole lot about git and git svn, but I'm not convinced the end goal was worth it. For this use case (occasionally update an svn repository to the HEAD of the git repository), some custom scripts might have been more effective.

jwhitlock answered Sep 21 '22 09:09


Just merge it?

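git checkout master
# Git 2.9 and later also need --allow-unrelated-histories here, because the two branches share no ancestor
git merge -X theirs svn/trunk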

al. answered Sep 22 '22 09:09