
How do I prepend history to a Git repository?

Tags: git, svn, git-svn

I have a project that has existed in two SVN repositories. The second SVN repository was created simply by adding the files from a checkout of the old SVN repository, with the SCM information stripped. The contents of the files are byte-identical, but there is no associated SCM metadata.

I have taken the new SVN repository and ported it into a Git repository via git-svn. Now I would like to import the old repository and somehow link it to the new repository so I can see the history across both. Is there a simple way to do this without hand-stitching the two repositories together?

asked Aug 03 '09 03:08 by shemnon



2 Answers

See also the question "How do I re-play my commits of a local Git repository, on top of a project I forked on github.com?" (and my answer there), although I think the situation there is slightly different.


You have at least three possibilities:

  • Use grafts to join two histories, but do not rewrite history. This means that you (and anybody who has the same grafts) would have full history, while other users would have a smaller repository. This also avoids problems with rewritten history if somebody already started working on top of the converted repository with a shorter history.

  • Use grafts to join two histories, check that the result is correct using "git log" or "gitk" (or another Git history browser/viewer), and then rewrite history using git filter-branch; afterwards you can remove the grafts file. This means that everybody who clones (fetches) from the rewritten repository gets the full, joined history. But rewriting history is a big no if somebody has already based work on the converted short-history repository (this case might not apply to you).

  • Use git replace to join two histories. This allows people to select whether they want the full history or just the current history, by choosing to fetch refs/replace/ (then they get the full history) or not (then they get the short history). Unfortunately, at the time of writing this requires a not-yet-released version of Git: either the development ('master') version or one of the release candidates for 1.6.5. The refs/replace/ hierarchy is planned for the upcoming Git version 1.6.5.


Below are step-by-step instructions for all three methods: grafts (local), rewriting history using grafts, and refs/replace/.

In all cases I assume that you have both the current and the historical history in a single repository (you can add history from another repository using git remote add followed by git fetch). I also assume that (one of) the branches in the short-history repository is named 'master', and that the branch (commit) of the historical repository where you want to attach the current history is called 'history'. You will have to substitute your own branch names (or commit IDs).
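As a rough sketch of that setup, the script below builds two throwaway repositories and brings the old history into the new one. The remote name 'old-history', the temporary paths, and the demo commits are assumptions made purely for illustration; only the branch names 'master' and 'history' come from the text above.

```shell
set -eu
# Throwaway identity so the demo commits work in any environment.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the converted old repository (hypothetical content).
git init -q "$work/old"
git -C "$work/old" symbolic-ref HEAD refs/heads/master
git -C "$work/old" commit -q --allow-empty -m "old: first import"
git -C "$work/old" commit -q --allow-empty -m "old: last commit before the move"

# Stand-in for the current, short-history repository.
git init -q "$work/new"
git -C "$work/new" symbolic-ref HEAD refs/heads/master
git -C "$work/new" commit -q --allow-empty -m "new: initial revision"

# Bring the old history into the new repository and name its tip 'history'.
cd "$work/new"
git remote add old-history "$work/old"
git fetch -q old-history
git branch --no-track history old-history/master

git log --oneline history
```

After this, 'master' and 'history' live in the same repository but share no commits, which is the starting point assumed by the steps that follow.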

Finding the commit to attach (root of the short history)

First, you have to find the commit (that is, its SHA-1 identifier) in the short history that you want to attach to the full history. It will be the first commit in the short history, i.e. the root commit (the commit without any parents).

There are two ways of finding it. If you are sure that you do not have any other root commit, you can find the last (bottommost) commit in topological order, using:

$ git rev-list --topo-order master | tail -n 1

(where tail -n 1 is used to get the last line of the output; you don't need to use it if you don't have it.)

If there is a possibility of multiple root commits, you can find all parentless commits using the following one-liner:

$ git rev-list --parents master | grep -v ' '

(where grep -v ' ', that is, a space between single quotes, is used to filter out all commits that have any parents). If there is more than one such commit, you have to inspect them (using e.g. "git show <commit>") and select the one that you want to attach to the earlier history.

Let's call this commit TAIL. You can save it in a shell variable using (assuming that the simpler method works for you):

$ TAIL=$(git rev-list --topo-order master | tail -n 1)

In the description below I will use $TAIL to mean that you have to substitute the SHA-1 of the bottommost commit in the current (short) history... or let the shell do the substitution for you.
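To see both methods side by side, here is a minimal sketch on a throwaway repository with a single root commit. The demo repository and its commits are assumptions for illustration. The --max-parents=0 option is not part of the answer above; it is a rev-list option available in newer Git versions that lists parentless commits directly.

```shell
set -eu
# Throwaway identity so the demo commits work in any environment.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "initial revision"
git commit -q --allow-empty -m "later work"

# Method 1: last commit in topological order (only safe with a single root).
TAIL=$(git rev-list --topo-order master | tail -n 1)

# Method 2: keep only the lines of --parents output that contain no space,
# i.e. commits that list no parent after their own ID.
roots=$(git rev-list --parents master | grep -v ' ')

# Assumption: a newer Git, where rev-list can filter root commits itself.
roots_flag=$(git rev-list --max-parents=0 master)

echo "TAIL=$TAIL"
echo "roots=$roots"
```

With a single root, all three values are the same commit ID; with several roots, the second and third forms print one ID per line and you still have to pick the right one by hand.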

Finding the commit to attach to (top of the historical repository)

This part is simple. We have to convert the symbolic name of the commit into a SHA-1 identifier. We can do this using "git rev-parse":

$ git rev-parse --verify history^0

(where 'history^0' is used in place of 'history' in case 'history' is a tag; we need the SHA-1 of the commit, not of a tag object). As with the commit to attach, let's give this commit ID a name: TOP. You can save it in a shell variable using:

$ TOP=$(git rev-parse --verify history^0)
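The difference that ^0 makes only shows up when 'history' is an annotated tag. A small sketch, assuming a throwaway repository in which 'history' is created as an annotated tag (the repository and tag message are made up for the demonstration):

```shell
set -eu
# Throwaway identity so the demo commit and tag work in any environment.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git commit -q --allow-empty -m "old: last commit before the move"

# Here 'history' is an annotated tag, so it names a tag object.
git tag -a -m "tip of the old repository" history

tag_id=$(git rev-parse --verify history)
TOP=$(git rev-parse --verify history^0)

echo "history   -> $(git cat-file -t "$tag_id") $tag_id"
echo "history^0 -> $(git cat-file -t "$TOP") $TOP"
```

If 'history' is a plain branch, both forms give the same commit ID, so using ^0 costs nothing.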

Joining history using a grafts file

The grafts file, located at .git/info/grafts (you need to create this file if it doesn't exist and you want to use this mechanism), is used to replace the parent information for a commit. It is a line-based format, where each line contains the SHA-1 of a commit we want to modify, followed by a space-separated list of zero or more commits that we want the given commit to have as parents; this is the same format that "git rev-list --parents <revision>" outputs.

We want the $TAIL commit, which doesn't have any parents, to have $TOP as its single parent. So the info/grafts file should contain a line with the SHA-1 of the $TAIL commit, followed by a space and the SHA-1 of the $TOP commit. You can use the following one-liner for this (see also the examples in the git filter-branch documentation):

$ echo "$TAIL $TOP" >> .git/info/grafts

Now you should check, using "git log", "git log --graph", "gitk" or another history browser, that you joined the histories correctly.
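The whole grafting step can be tried out safely on a throwaway repository first. The sketch below assumes a demo repository with two unrelated branches, 'history' and 'master', created only for illustration. Note that newer Git versions print a deprecation hint when they find a grafts file; this sketch assumes a Git version that still honours the file.

```shell
set -eu
# Throwaway identity so the demo commits work in any environment.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

# Old history: two commits on 'history'.
git symbolic-ref HEAD refs/heads/history
git commit -q --allow-empty -m "old: first import"
git commit -q --allow-empty -m "old: last commit before the move"

# Short history: an unrelated root on 'master', as after the SVN move.
git checkout -q --orphan master
git commit -q --allow-empty -m "new: initial revision"
git commit -q --allow-empty -m "new: later work"

TAIL=$(git rev-list --topo-order master | tail -n 1)
TOP=$(git rev-parse --verify history^0)

before=$(git rev-list --count master)

# Graft: make $TAIL appear to have $TOP as its only parent.
mkdir -p .git/info
echo "$TAIL $TOP" >> .git/info/grafts

after=$(git rev-list --count master)
echo "commits reachable from master: $before before, $after after the graft"
```

The commit IDs themselves do not change here; only the local view of the parent links does, which is why the graft is invisible to anybody who clones the repository.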

Rewriting history according to the grafts file

Please note that this would change history!

To make the history recorded in the grafts file permanent, it is enough to use "git filter-branch" to rewrite the branches you need. If there is only a single branch that needs to be rewritten ('master'), it can be as simple as:

$ git filter-branch $TOP..master

(This processes only the minimal set of commits.) If more branches are affected by joining the histories, you can simply use

$ git filter-branch --all

Now you can delete the grafts file. Check that everything is as you wanted, and then remove the backup in refs/original/ (see the documentation for "git filter-branch" for details).
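Put together, the rewrite could look roughly like the following on a throwaway repository. The demo repository, file name, and commits are assumptions for illustration. FILTER_BRANCH_SQUELCH_WARNING is set only to skip the warning and pause that newer Git versions add to filter-branch; the sketch also assumes a Git version that still honours the grafts file.

```shell
set -eu
# Throwaway identity so the demo commits work in any environment.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/history
echo one > file.txt; git add file.txt
git commit -q -m "old: first import"
echo two > file.txt
git commit -q -a -m "old: last commit before the move"

# Unrelated root with byte-identical content, as in the question.
git checkout -q --orphan master
git commit -q -m "new: initial revision"
echo three > file.txt
git commit -q -a -m "new: later work"

TAIL=$(git rev-list --topo-order master | tail -n 1)
TOP=$(git rev-parse --verify history^0)
mkdir -p .git/info
echo "$TAIL $TOP" >> .git/info/grafts

# Rewrite only the commits between $TOP and master so that the grafted
# parent becomes part of the real commit objects.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch "$TOP..master" >/dev/null

# The graft is now baked in: drop the grafts file and the backup ref.
rm .git/info/grafts
git update-ref -d refs/original/refs/heads/master

git log --oneline master
```

After the rewrite, the two newest commits on 'master' have new IDs, which is exactly why this variant is a problem once somebody else has based work on the old ones.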

Using refs/replace/ mechanism

This is an alternative to the grafts file. It has the advantage that it is transferable, so if you have published the short history and cannot rewrite it (because others have based their work on the short history), then using refs/replace/ might be a good solution... well, at least once Git version 1.6.5 gets released.

The refs/replace/ mechanism operates differently from a grafts file: instead of modifying the parent information, you replace objects. So first you have to create a commit object that has the same properties as $TAIL, but has $TOP as a parent.

We can use

$ git cat-file commit $TAIL > TAIL_COMMIT

(The name of the temporary file is only an example.)

Now you need to edit the 'TAIL_COMMIT' file, which will look something like this:

tree 2b5bfdf7798569e0b59b16eb9602d5fa572d6038
author Joe R Hacker <joe@example.com> 1112911993 -0700
committer Joe R Hacker <joe@example.com> 1112911993 -0700

Initial revision of "project", after moving to new repository

Add $TOP as a parent by putting a line with "parent $TOP" (where $TOP has to be expanded to the SHA-1 ID!) between the 'tree' header and the 'author' header. After editing, 'TAIL_COMMIT' should look like this:

tree 2b5bfdf7798569e0b59b16eb9602d5fa572d6038
parent 0f6592e3c2f2fe01f7b717618e570ad8dff0bbb1
author Joe R Hacker <joe@example.com> 1112911993 -0700
committer Joe R Hacker <joe@example.com> 1112911993 -0700

Initial revision of "project", after moving to new repository

If you want, you can edit the commit message.

Now you need to use git hash-object to create a new commit in the repository. You need to save the result of this command, which is the SHA-1 of a new commit object, for example like this:

$ NEW_TAIL=$(git hash-object -t commit -w TAIL_COMMIT)

(The '-w' option is there to actually write the object to the repository.)

Finally use git replace to replace $TAIL by $NEW_TAIL:

$ git replace $TAIL $NEW_TAIL

What is left is to check (using "git log" or some other history viewer) that the history is correct.
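The manual editing step can also be scripted. The following is a sketch on a throwaway repository, not a tested recipe for a real project: the demo commits and the second temporary file name 'NEW_TAIL_COMMIT' are assumptions for illustration, and the script relies on the first line of a root commit object being its 'tree' header.

```shell
set -eu
# Throwaway identity so the demo commits work in any environment.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/history
git commit -q --allow-empty -m "old: first import"
git commit -q --allow-empty -m "old: last commit before the move"
git checkout -q --orphan master
git commit -q --allow-empty -m "new: initial revision"
git commit -q --allow-empty -m "new: later work"

TAIL=$(git rev-list --topo-order master | tail -n 1)
TOP=$(git rev-parse --verify history^0)

# Dump the root commit and insert a 'parent' header after the 'tree' line.
git cat-file commit "$TAIL" > TAIL_COMMIT
{
  head -n 1 TAIL_COMMIT        # tree header
  echo "parent $TOP"           # new parent header
  tail -n +2 TAIL_COMMIT       # author, committer, blank line, message
} > NEW_TAIL_COMMIT

# Write the edited commit into the object database and install the replacement.
NEW_TAIL=$(git hash-object -t commit -w NEW_TAIL_COMMIT)
git replace "$TAIL" "$NEW_TAIL"

full=$(git rev-list --count master)
short=$(git --no-replace-objects rev-list --count master)
echo "master has $full commits with the replacement, $short without"
```

The --no-replace-objects option makes Git ignore refs/replace/ for that one command, which is a convenient way to compare the joined and the original view.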

Now anybody who wants to have the full history needs to add '+refs/replace/*:refs/replace/*' as one of their fetch (pull) refspecs.
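On the consumer side, that could look like the sketch below. It assumes a throwaway "published" repository and a clone of it, both created only for the demonstration. For brevity the replacement is created with git replace --graft, a shortcut available in newer Git versions that is not part of the answer above; the remote name 'origin' is simply what git clone sets up by default.

```shell
set -eu
# Throwaway identity so the demo commits work in any environment.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Publisher: two unrelated one-commit histories joined by a replacement.
git init -q "$work/pub"
cd "$work/pub"
git symbolic-ref HEAD refs/heads/history
git commit -q --allow-empty -m "old: last commit before the move"
git checkout -q --orphan master
git commit -q --allow-empty -m "new: initial revision"
# Assumption: newer Git, where --graft builds the replacement commit for us.
git replace --graft master history

# Consumer: a plain clone does not fetch refs/replace/ ...
git clone -q "$work/pub" "$work/clone"
cd "$work/clone"
short=$(git rev-list --count master)

# ... until the extra refspec is added and the refs are fetched.
git config --add remote.origin.fetch '+refs/replace/*:refs/replace/*'
git fetch -q origin
full=$(git rev-list --count master)

echo "master in the clone: $short commit(s) before, $full after fetching refs/replace/"
```

People who never add the refspec keep seeing the short history, and nobody's commit IDs change either way.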

Final note: I have not checked this solution, so your mileage may vary.

answered Sep 18 '22 16:09 by Jakub Narębski


First, create a graft point to attach the two histories. Then run git filter-branch over the repository to make the change permanent. Note that this will change the commit IDs of all commits downstream of the graft.

answered Sep 17 '22 16:09 by bdonlan