I contribute to an open source project on GitHub. I created a pull request, which was reviewed by the maintainer, who asked me to rework some things. In the meantime, other people contributed too. So first I used git rebase master to rebase my pull request on top of the latest master branch, then git rebase -i HEAD~5 (interactive rebase) to fix some of my commits, and finally git push --force to my own remote branch.
However, after that, GitHub thinks that my branch can no longer be merged into the master branch, due to conflicts. In fact, it thinks that some other commits (not mine) were added to my branch, which obviously conflict with the same ones in master.
What did I do wrong, and how can I do it correctly?
Assume the master branch has the following history
(master) A -> B -> C -> D
Then I commit X, Y, and Z, so my branch history is:
(my-branch) A -> B -> C -> D -> X -> Y -> Z
Let's also assume that someone committed and pushed E to master in the meantime:
(master) A -> B -> C -> D -> E
So first I do git rebase master on my-branch. After this my branch history is:
(my-branch) A -> B -> C -> D -> E -> X -> Y -> Z
Then I do git rebase -i HEAD~5 like this:
pick D
pick E
pick X
squash Y
edit Z
Note that I don't change D or E.
After this my history looks like:
(my-branch) A -> B -> C -> D -> E -> X -> Z
Then I do git push myremote my-branch --force.
After this, GitHub claims that my pull request can no longer be merged, because commits C, D, and E on my branch conflict with C, D, and E on master. Note that I didn't edit these commits, only my own (and C was not even part of the interactive rebase).
Why is git behaving like this?
How can I avoid git interactive rebase messing up a pull request?
You must be a little bit careful with interactive rebase (or any rebase, really, but it tends to show up much more often with interactive rebase since it's so flexible).
It may help to think about the commit graph in more detail. Specifically, this kind of drawing is inherently somewhat wrong:
(master) A -> B -> C -> D
It's much better to draw this as:
A <-B <-C <-D <--master
The key item here is putting the branch name on the right, with an arrow coming out of it. The rest of the arrows can be converted into straight linkages:
A--B--C--D <-- master
because all commits are read-only. It's hard to draw the internal arrows, so knowing that they literally can't change, we can draw them as connecting lines. The reason to draw them in, at least initially, is that the arrows are connected to the children, not the parents: the children know who their parent commits are—D knows to reach back to C, for instance—but the parents have no idea who their children are. There is no way to go from A to B; we can only go from B back to A, and then realize that hey, we got here from B, so there must be a path from A to B after all.
This is a key realization when working with Git: it does everything backwards.
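The one-way nature of these links can be checked directly. The following is a minimal sketch in a throwaway repository (the temporary directory, the demo identity, and the commit messages A and B are assumptions made for illustration): the commit object for B records A's hash as its parent, while A's object contains no mention of B.

```shell
#!/bin/sh
# Sketch only: throwaway repository, hypothetical commit messages and identity.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
# Identity comes from the environment so the sketch ignores any global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git commit -q --allow-empty -m A
git commit -q --allow-empty -m B

hash_a=$(git rev-parse HEAD~1)
# The raw commit object for B contains a "parent" line naming A ...
parent_of_b=$(git cat-file -p HEAD | sed -n 's/^parent //p')
# ... while the object for A has no "parent" line and no reference to B.
parents_of_a=$(git cat-file -p "$hash_a" | sed -n 's/^parent //p')

echo "A is          $hash_a"
echo "B's parent is $parent_of_b"
echo "A's parents:  '$parents_of_a'"
```

Because the parent hash is part of the commit's own content, pointing a commit at a different parent necessarily produces a different commit with a different hash.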
The reason all of this matters has to do with what happens during parallel development. As you said, you made X, Y, and Z:
A--B--C--D <-- master
          \
           X--Y--Z <-- my-branch
Your X knows about D, but D—which you share with several other Git repositories—knows nothing about your X.
Meanwhile, someone else made E, which also points back to D. If we imagine some sort of godlike super-repository that somehow knows about E and X-Y-Z all at the same time, we can draw it like this:
           E <-- third-person
          /
A--B--C--D <-- master
          \
           X--Y--Z <-- my-branch
and then they—"they" here being the open-source group on GitHub—adopted this third person's E into their own master:
           E <-- master, third-person
          /
A--B--C--D
(they still don't know about your X-Y-Z yet, perhaps, so we just take them off the drawing). Now you run git fetch upstream to your local computer to pick up the GitHub repository versions, and in your repository, which does still have X-Y-Z, you have:
           E <-- upstream/master
          /
A--B--C--D <-- master, origin/master
          \
           X--Y--Z <-- my-branch (HEAD), origin/my-branch
(I'm assuming here that your origin represents your GitHub fork, and your upstream represents the original GitHub repository from which you made your fork).
When you run git rebase upstream/master, or make your master point to commit E and run git rebase master, you have your Git make copies of commits X, Y, and Z:
             X'-Y'-Z' <-- my-branch (HEAD)
            /
           E <-- upstream/master
          /
A--B--C--D <-- master, origin/master
          \
           X--Y--Z <-- origin/my-branch
(this drawing assumes you have not bothered to update your master; if you have, we could draw it a bit differently).
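That rebase copies commits rather than moving them can be observed in a throwaway repository. This is a sketch under stated assumptions: the commit helper, the file names, and the demo identity are hypothetical, and a local master stands in for upstream/master. After the rebase, your commit has a new hash and sits on top of E, while E itself is untouched.

```shell
#!/bin/sh
# Sketch only: throwaway repository with hypothetical file names and identity.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: create one file and commit it under the given message.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit D
git checkout -q -b my-branch
commit X
old_x=$(git rev-parse HEAD)      # hash of the original X

git checkout -q master
commit E                         # someone else's work lands on master
hash_e=$(git rev-parse HEAD)

git checkout -q my-branch
git rebase -q master             # copies X to X' on top of E

new_x=$(git rev-parse HEAD)
new_parent=$(git rev-parse HEAD~1)
echo "X  was $old_x"
echo "X' is  $new_x"
echo "X' has parent $new_parent (E is $hash_e)"
```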
These copies—the ones labeled with tick marks, X'-Y'-Z'—are ready for you to force-push to your fork. Assuming you do that at this point, your fork will have them and the original X-Y-Z chain will be abandoned:
             X'-Y'-Z' <-- my-branch (HEAD), origin/my-branch
            /
           E <-- upstream/master
          /
A--B--C--D <-- master, origin/master
          \
           X--Y--Z [abandoned]
If we update your own master and origin/master as well, we'll be able to redraw all of this as the simpler:
A--B--C--D--E <-- master, origin/master, upstream/master
             \
              X'-Y'-Z' <-- my-branch (HEAD), origin/my-branch
(all of these being in your repository on your computer—on GitHub, in your fork, these are just named master and my-branch without the origin/ part and there are no upstream/ parts at all).
Your pull request will automatically be updated to use the new X'-Y'-Z' commits, which simply add on to commit E—so if their repository still ends at commit E, there should be no issues with them absorbing your commits.
If you wish to squash Y' into X' and do something with the commit message of Z', you can do that now using interactive rebase. Or, you can do all of this instead of the initial rebase:
$ git checkout my-branch
$ git rebase -i upstream/master
Note that upstream/master here really means "commit E" in our graph, whichever of the drawings above applies: each has E as upstream/master and either Z or Z' as the tip of your own my-branch.
Git will now make a list of the three commits that are on my-branch but are not on upstream/master. That is, starting from Z or Z' and working backwards towards A, it lists those commits, then removes from that list all commits reachable by starting from E and working backwards. These three commits' hash IDs will go into the three pick commands.
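You can preview that list before starting the rebase with the two-dot range upstream/master..my-branch. A minimal sketch, assuming two throwaway repositories that stand in for the upstream project and your local clone (the directory names, the commit helper, and the demo identity are all hypothetical):

```shell
#!/bin/sh
# Sketch only: "project" plays the upstream repository, "work" plays your clone.
set -e
base=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: create one file and commit it under the given message.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

# The upstream project, ending at D.
git init -q "$base/project"
cd "$base/project"
git symbolic-ref HEAD refs/heads/master
commit D

# Your clone, with the remote named "upstream", plus your own X, Y and Z.
git clone -q -o upstream "$base/project" "$base/work"
cd "$base/work"
git checkout -q -b my-branch
commit X
commit Y
commit Z

# Meanwhile E lands upstream, and you fetch it.
cd "$base/project"
commit E
cd "$base/work"
git fetch -q upstream

# Commits reachable from my-branch but not from upstream/master, oldest first.
todo=$(git log --reverse --format=%s upstream/master..my-branch | xargs)
echo "$todo"   # prints: X Y Z
```

D and E do not appear in the output, so they would not appear in the todo list of git rebase -i upstream/master either.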
You can now change one to squash and one to edit as before: since you're working only on your commits, and not on anyone else's, you won't be copying anyone else's commits to new and different commits. When you're done editing, you will have new commits yet again—if you already copied X to X', Y to Y', and Z to Z', now you have X" which is X' + Y' (squashed), and Z" which is the edited version of Z' but connects to X":
              X"-Z" <-- my-branch (HEAD)
             /
A--B--C--D--E <-- master, origin/master, upstream/master
             \
              X'-Y'-Z' <-- origin/my-branch
You can now force-push these to your fork, so that its my-branch (which you are calling origin/my-branch in your Git repository) changes to point to commit Z". Your fork is then updated again, your pull request is auto-updated, and you're (still, or finally for real) ready for them to look at X" and Z".
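Why the push has to be forced can be seen in a small experiment. This is a sketch only: a local bare repository stands in for your GitHub fork, git commit --amend stands in for the interactive rebase, and the paths, file name, and demo identity are hypothetical.

```shell
#!/bin/sh
# Sketch only: "fork.git" plays your GitHub fork, "work" plays your local clone.
set -e
base=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q --bare "$base/fork.git"
git init -q "$base/work"
cd "$base/work"
git symbolic-ref HEAD refs/heads/my-branch
git remote add origin "$base/fork.git"

echo one > f.txt
git add f.txt
git commit -q -m "Z'"
git push -q origin my-branch

# Rewrite the tip commit, as an interactive rebase would.
git commit -q --amend -m 'Z"'

# A plain push is refused: the fork's tip is not an ancestor of the new tip.
if git push -q origin my-branch 2>/dev/null; then
  plain=accepted
else
  plain=rejected
fi

# A forced push moves the fork's my-branch to the rewritten commit.
git push -q --force origin my-branch
remote_tip=$(git ls-remote origin refs/heads/my-branch | cut -f1)
local_tip=$(git rev-parse HEAD)

echo "plain push: $plain"
echo "fork tip:   $remote_tip"
echo "local tip:  $local_tip"
```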