Reapply Git commits from copied fork repository to original repository

A university colleague of mine thought it was a good idea to fork a repository by cloning it and copying its contents into a new, freshly initialized repository, without the .git folder from the original repository. He then committed this copy as a single commit, and the whole team began working on the project based on that commit:

A <- B <- C     <- D <- E    (original repository)
\  clone  /        |_____| 
 \       /            |
  \     /     Ofc. work on the original repository was continued after cloning...
   \   /
     M <- N <- O <-P    (our "fork", commits from my team)

Now, my first goal is to get the following repository structure:

A <- B <- C <- N <- O <- P

Over the past few hours I have tried the following:

  1. Diff and apply:
    • Clone the original repository.
    • git diff > /path/to/patch from within the fork.
    • git apply within the original repository.
    • Works, but does not preserve the commits.
  2. Various other things that will not work.
  3. format-patch and am:
    • Clone the original repository.
    • Create and switch to a new branch.
    • Reset it to the commit A using git reset --hard COMMIT_HASH_A.
    • Create a patch from N <- O <- P using git format-patch COMMIT_HASH_M --stdout > /path/to/patch on the fork.
    • Apply this patch on the original repository using git am -3 /path/to/patch. After resolving several conflicts such as the duplicate creation of empty files, this will result in the following error: fatal: sha1 information is lacking or useless (some_file_name). Repository lacks necessary blobs to fall back on 3-way merge. This is where I am stuck.

So how do I create a repository including all commits from the original repository and from our team as described, so that eventually, a pull request could be sent to the original repository? Might a git-rebase help?

In your original repo clone, you should:

git remote add colleague /path/to/colleague
git fetch colleague
git checkout -b colleague colleague/master
git rebase master
git checkout master
git merge colleague

This will give you linear history and will not leave behind a redundant and parent-less M commit.
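If you want to check that claim on your own clone, here is a minimal sketch. The helper name history_shape is my own invention, not a Git command; it only wraps git rev-list to count merge commits and root (parent-less) commits reachable from a branch.

```shell
# history_shape REPO BRANCH: print "merges=<n> roots=<n>" for BRANCH in REPO.
# (Hypothetical helper for illustration; it only reads history.)
history_shape() {
    # Commits with more than one parent, i.e. merge commits.
    merges=$(git -C "$1" rev-list --merges --count "$2")
    # Commits with no parent at all, i.e. root commits.
    roots=$(git -C "$1" rev-list --max-parents=0 --count "$2")
    echo "merges=$merges roots=$roots"
}
```

Running history_shape . master after the rebase and merge should report no merges and a single root. A second root would mean the parent-less M commit made it into the branch.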

This is different from David Siro's answer, which will produce a merge commit that also leaves a redundant/parent-less M commit floating around in the branch you merged from. I don't like that dangling-commit scenario.

Original Post

I replicated your good and bad repository histories and was able to solve the problem by basically rebasing a remote.

These are the steps I followed:

  1. Clone original repository
  2. Add a remote to the bad repo
  3. Fetch the bad repo master branch
  4. Branch into the fetched bad repo
  5. Rebase the bad master branch onto your master (Git will report that some changes are already applied)
  6. Merge this branch into your master
  7. Push back to original repository
  8. Schedule your colleague's demise 😉

With that setup, the commands I used and the key output follow.

# Step 1
$ git clone <path-to-original-repo>
$ cd original-repo

# Step 2
$ git remote add messed-up-repo <path-to-messed-up-repo>

# Step 3
$ git fetch messed-up-repo

# Step 4
$ git checkout -b bad-master messed-up-repo/master

# Step 5
$ git rebase master
First, rewinding head to replay your work on top of it...
Applying: commit M
Using index info to reconstruct a base tree...
Falling back to patching base and 3-way merge...
No changes -- Patch already applied.
Applying: commit N
Applying: commit O
Applying: commit P

# Step 5.1: look at your new history
$ git log --oneline --graph --decorate
* cc3121d (HEAD -> bad-master) commit P
* 1144414 commit O
* 7b3851c commit N
* b1dc670 (origin/master, origin/HEAD, master) commit E
* ec9eb4e commit D
* 9c2988f commit C
* 9d35ed6 commit B
* ae9fc2f commit A

# Step 6
$ git checkout master
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.
$ git merge bad-master 
Updating b1dc670..cc3121d
 n.txt | 1 +
 o.txt | 1 +
 p.txt | 1 +
 3 files changed, 3 insertions(+)
 create mode 100644 n.txt
 create mode 100644 o.txt
 create mode 100644 p.txt

# Step 7
$ git push
Counting objects: 9, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (6/6), done.
Writing objects: 100% (9/9), 714 bytes | 0 bytes/s, done.
Total 9 (delta 3), reused 0 (delta 0)
To /tmp/repotest/good-orig.git
   b1dc670..cc3121d  master -> master

# Step 7.1: look at your history again
$ git log --oneline --graph --decorate
* cc3121d (HEAD -> master, origin/master, origin/HEAD, bad-master) commit P
* 1144414 commit O
* 7b3851c commit N
* b1dc670 commit E
* ec9eb4e commit D
* 9c2988f commit C
* 9d35ed6 commit B
* ae9fc2f commit A

You can now destroy your colleague's messed up repository with fire and get others to continue using the original, and now fixed, repository.

Note: In your post, you said you wanted commits:

A <- B <- C <- N <- O <- P

But my solution includes commits D and E in between: A <- B <- C <- D <- E <- N <- O <- P. If you really want to throw those commits away (that is, assuming it's not a typo in your post), you can run git rebase -i HEAD~5, remove the pick lines for those commits, and then git push --force to your good repo's origin.

I'm assuming you understand the implications of re-writing history and that you need to communicate with your users so that they don't get bit by it.
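As a non-interactive alternative to editing the pick lines, git rebase --onto can replay only N, O and P on top of C. This is a swapped-in technique, not what I ran above. The sketch below builds a throwaway repository and tags each commit with its letter; the tags, file names and identity values are assumptions made purely so the example is self-contained.

```shell
#!/bin/sh
# Sketch: drop D and E by replaying N, O, P directly onto C.
set -eu

# Placeholder identity so commits work without a global Git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/master

# Build A <- B <- C <- D <- E <- N <- O <- P, one new file per commit.
for c in A B C D E N O P; do
    echo "$c" > "$repo/$c.txt"
    git -C "$repo" add "$c.txt"
    git -C "$repo" commit -q -m "commit $c"
    git -C "$repo" tag "$c"   # hypothetical tag so commits can be named by letter
done

# Take the commits in E..master (N, O, P) and replay them onto C.
git -C "$repo" rebase -q --onto C E master

# List the resulting subjects, oldest first.
git -C "$repo" log --reverse --format=%s master
```

In a real repository you would use the hashes of C and E instead of tags. The replay only goes cleanly if N, O and P do not depend on changes made in D or E; otherwise expect conflicts.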

For the sake of completeness, I replicated your setup as follows:

  1. Create original good repo history: A <- B <- C
  2. Manually copy original contents to messed up repo
  3. Generate messed up commit history: M <- N <- O <- P, where M has the same content as original A <- B <- C
  4. Add work to original repo: ... C <- D <- E
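
The four steps above can be sketched as a script. This is a reconstruction under assumptions, not the exact commands I ran: the directory names good-orig and messed-up, the one-file-per-commit layout, the identity values and the helper functions new_repo and commit_file are all made up for illustration.

```shell
#!/bin/sh
# Sketch: rebuild the "good" and "messed up" histories in a scratch directory.
set -eu

# Placeholder identity so commits work without a global Git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)

# new_repo DIR: create an empty repository whose first branch is master.
new_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
}

# commit_file DIR NAME: add a one-line file and commit it as "commit NAME".
commit_file() {
    lower=$(printf '%s' "$2" | tr 'A-Z' 'a-z')
    echo "$2" > "$1/$lower.txt"
    git -C "$1" add "$lower.txt"
    git -C "$1" commit -q -m "commit $2"
}

# 1. Original history: A <- B <- C
new_repo "$work/good-orig"
for c in A B C; do commit_file "$work/good-orig" "$c"; done

# 2. Copy the working tree, but not .git, into a fresh repository.
new_repo "$work/messed-up"
cp "$work/good-orig"/*.txt "$work/messed-up"/

# 3. M is a single commit holding the content of A..C; N, O, P follow.
git -C "$work/messed-up" add .
git -C "$work/messed-up" commit -q -m "commit M"
for c in N O P; do commit_file "$work/messed-up" "$c"; done

# 4. Work continues in the original: C <- D <- E
for c in D E; do commit_file "$work/good-orig" "$c"; done

echo "repositories created under $work"
```

The two directories it prints can then stand in for the original and the messed-up repository when trying out the remote, fetch and rebase steps.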

If you don't insist on linear history, you can merge your fork into the original repository.

In the original repo directory:

git remote add fork /path/to/fork
git fetch fork
git merge fork/master

This will preserve the commits and may result in linear history (no merge commit) if the merge can be fast-forwarded.
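
One caveat for the copied-fork case: the fork shares no common ancestor with the original, and since Git 2.9 git merge refuses such a merge unless you pass --allow-unrelated-histories. The sketch below shows this on a throwaway pair of repositories; the directory names orig and fork, the file names, the identity values and the commit_file helper are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch: merging a copied "fork" whose history is unrelated to the original.
# Assumes Git 2.9 or newer for --allow-unrelated-histories.
set -eu

# Placeholder identity so commits work without a global Git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)

# commit_file DIR NAME: add NAME.txt to the repository in DIR and commit it.
commit_file() {
    echo "$2" > "$1/$2.txt"
    git -C "$1" add "$2.txt"
    git -C "$1" commit -q -m "commit $2"
}

for r in orig fork; do
    git init -q "$work/$r"
    git -C "$work/$r" symbolic-ref HEAD refs/heads/master
done

commit_file "$work/orig" A                 # original history
cp "$work/orig/A.txt" "$work/fork/A.txt"   # copy the content, not .git
git -C "$work/fork" add A.txt
git -C "$work/fork" commit -q -m "commit M"
commit_file "$work/fork" N                 # team work in the fork
commit_file "$work/orig" D                 # upstream work after the copy

cd "$work/orig"
git remote add fork "$work/fork"
git fetch -q fork
# No common ancestor, so the explicit opt-in is required.
git merge -q --allow-unrelated-histories -m "Merge fork" fork/master
git log --oneline --graph
```

Because the two histories are unrelated, this merge cannot be fast-forwarded: it creates a merge commit and keeps M as a second root in the history.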

