
Why should we never use rebase with commits that have been pushed?

Tags:

git

I'm still getting my feet wet with VSTS and Git. I understand the scenario where changes in the master branch need to get into the feature branch, but these "reminders" or tips don't make sense to me yet. What is meant by the statement below? https://docs.microsoft.com/en-us/vsts/git/tutorial/rebase?tabs=visual-studio

[quote]

Never rebase commits that have been pushed and shared with others. The only exception to this rule is when you are certain no one on your team is using the commits or the branch you pushed.

After reading a bit further, and coming from SVN, I think I see why the above statement was made:

Never force push a branch that others are working on. Only force push branches that you alone work with.

This would be "similar" to a situation with SVN where (1) you have a local branch that others are working on, then (2) make a bug fix directly in trunk, then (3) merge those changes down to your local branch, and (4) commit the updates in the local branch, thus forcing anyone else working on that local branch to get the update and potentially have a merge conflict.

bitshift asked Feb 09 '18 14:02



2 Answers

Ashish Mathew's answer, which links to and quotes from the Pro Git book, is correct, but you may need a lot more background to really understand it. I'd like to start, though, by saying that the word "never" is too strong. It's OK to rebase published commits under one condition: everyone who will have to deal with the problems this creates has agreed in advance to deal with them.

But what, then, are the problems created by doing this? The answer is in that quote: rebase works by copying commits.

Git has one real "true name" for each commit, which is that commit's hash ID. That true name—the hash ID—is how Git finds the underlying data, and how, when you connect two Gits to each other, they transfer the data. (In fact, these hash IDs are used for all four of Git's internal object types, though you yourself will mostly deal with commits.)

The hash ID for any given commit is unique and looks totally random, but in fact it's completely deterministic, having been computed from the data inside the commit. (It's a cryptographic hash of that data.) Hence your Git can connect to any other Git anywhere in the entire universe, and if your Git waves a raw hash ID like 8279ed033f703d4115bee620dccd32a9ec94d9aa at the other Git, the two Gits can immediately tell whether they both have that commit, or not. If both Gits have the commit, there's nothing to do; but if only one Git has the commit, the other Git will ask to get a copy.
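To make "computed from the data inside the commit" concrete, here is a minimal Python sketch of how an object ID is derived in a SHA-1 repository: the hash covers a short header (type and length) plus the object's bytes. The author line, timestamp, and parent IDs below are made up for illustration, and repositories using the newer SHA-256 format are not covered.

```python
import hashlib


def git_object_id(obj_type, body):
    """Hash an object as SHA-1 Git repositories do: '<type> <size>' + NUL + body."""
    header = f"{obj_type} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree, parents, message):
    """Build the text of a commit object (the author/committer line is invented)."""
    who = "A U Thor <author@example.com> 1518184920 +0000"
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {who}", f"committer {who}", "", message, ""]
    return "\n".join(lines).encode()


EMPTY_TREE = git_object_id("tree", b"")  # ID of a tree with no entries

# Same tree, author, and message; only the parent differs (placeholder IDs).
original = git_object_id("commit", commit_body(EMPTY_TREE, ["1" * 40], "fix bug"))
same = git_object_id("commit", commit_body(EMPTY_TREE, ["1" * 40], "fix bug"))
rebased = git_object_id("commit", commit_body(EMPTY_TREE, ["2" * 40], "fix bug"))

print(original == same)     # True: identical data, identical ID
print(original == rebased)  # False: a new parent means a new ID
```

Because the parent's ID is part of the hashed data, a commit that is moved onto a different parent cannot keep its old hash ID.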

(The transfer is always one way: git fetch has your Git call up another Git and download items from them, while git push has your Git call up another Git and send items to them. There's no fundamental reason you couldn't do both at the same time, but the commands are all written with unidirectional transfer in mind.)

This ability to do a very simple have/want exchange is how Git can rapidly transfer only the necessary objects: even if you have a fairly fat repository such as that for the Linux kernel—weighing in, today, at over 700k commits and about 2.4 GB of repository database—the git fetch command is quick:

$ time git fetch

real    0m0.457s
user    0m0.228s
sys     0m0.087s
$ 

(I ran an earlier git fetch this morning, which was a lot slower as I had not updated this copy of the kernel since late last year. That one took about 3 seconds of CPU time and about 10.5 seconds of real time, to bring over 11853 objects.)
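The have/want idea can be sketched in a few lines of Python. This is a deliberately simplified model, not Git's actual wire protocol: each repository is a dict mapping a commit ID to its list of parent IDs, and the one-letter IDs are stand-ins for real hashes.

```python
def reachable(commits, tip):
    """All commit IDs reachable from tip by following parent links."""
    seen, stack = set(), [tip]
    while stack:
        cid = stack.pop()
        if cid not in seen:
            seen.add(cid)
            stack.extend(commits[cid])
    return seen


def fetch(mine, theirs, their_tip):
    """Copy into `mine` whatever is reachable from their_tip that I lack."""
    missing = {cid for cid in reachable(theirs, their_tip) if cid not in mine}
    for cid in missing:
        mine[cid] = theirs[cid]
    return missing


theirs = {"a": [], "b": ["a"], "c": ["b"]}
mine = {"a": [], "b": ["a"], "x": ["b"]}  # "x" is my own local commit

first = fetch(mine, theirs, "c")
second = fetch(mine, theirs, "c")
print(sorted(first))   # ['c']: only the commit I lacked is transferred
print(sorted(second))  # []: nothing left to transfer
print("x" in mine)     # True: fetching never removes what I already had
```

Since the IDs alone say who has what, the second fetch has nothing to send, which is the reason a repeat fetch is so fast.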

Anyway, the short version of all of this is that Git tends to be like the Borg of source control systems: whatever you have, when I connect my Git to yours, I add everything you have that I don't to my repository. I keep everything I had before!

So, if you use git rebase on published commits—commits that I have now, because I got them from you earlier—you will, as in the quote, copy some of your existing commits to new commits that you think are new-and-improved. You will then switch to the new commits, abandoning the yucky old ones that you've copied. When I connect my Git to your Git, I'll end up with both the old and the new.

That doesn't seem so bad—but the problem is that my Git treats all commits in my repository as precious, so now I have both the old ones and the new ones. I haven't abandoned the old ones. If I've built new commits that use your old ones, I now have to somehow separate the work I've built from the work you've copied.
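The situation can be modeled with a toy commit store in Python. This is a sketch under simplifying assumptions: a "commit" here is just a parent ID plus a label, its ID is a hash of those two things, and "fetch" is modeled as adding whatever I lack. The helper names are hypothetical.

```python
import hashlib


def commit(repo, parent, change):
    """Store a toy commit and return its ID (a hash of parent + change)."""
    cid = hashlib.sha1(f"{parent}|{change}".encode()).hexdigest()
    repo[cid] = (parent, change)
    return cid


def history(repo, tip):
    """List the changes from the root commit up to tip."""
    changes = []
    while tip is not None:
        parent, change = repo[tip]
        changes.append(change)
        tip = parent
    return changes[::-1]


def rebase(repo, old_commits, new_base):
    """Copy each old commit onto new_base; the originals stay in repo."""
    tip = new_base
    for cid in old_commits:
        tip = commit(repo, tip, repo[cid][1])
    return tip


# You publish base <- B <- C.
yours = {}
base = commit(yours, None, "base")
b = commit(yours, base, "B")
c = commit(yours, b, "C")

# I fetch your commits and build D on top of C.
mine = dict(yours)
d = commit(mine, c, "D")

# You rebase B and C onto a new commit M.
m = commit(yours, base, "M")
new_tip = rebase(yours, [b, c], m)

# I fetch again: I gain what I lack and lose nothing.
for cid, data in yours.items():
    mine.setdefault(cid, data)

print(history(mine, d))        # ['base', 'B', 'C', 'D']
print(history(mine, new_tip))  # ['base', 'M', 'B', 'C']
print(c in mine, new_tip in mine, c != new_tip)  # True True True
```

My commit D still points at the old C, so my repository now holds two copies of B and C under different IDs, and nothing in it records that the new pair is meant to replace the old one.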

There are tools that can help with this—particularly git rebase --fork-point—but they're not the most wonderful things ever. They need to be used fairly quickly, right after I pick up your rebased commits, to be effective. So I will need to know, preferably in advance, that you will be rebasing your published work—your commits that I already have—so that I am prepared to do anything I must do to rebase my work on your rebased work.
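Here is a rough Python model of the idea behind the fork-point approach. It is a sketch, not Git's implementation: it assumes a linear history stored as a dict from commit ID to parent ID, and it assumes my Git keeps a log of the previous tips of your branch (newest first) that eventually expires, which is one plausible reading of why the tool has to be used quickly. The names and IDs are hypothetical.

```python
def ancestors(parents, tip):
    """IDs from tip back to the root, tip first (linear history only)."""
    chain = []
    while tip is not None:
        chain.append(tip)
        tip = parents[tip]
    return chain


def fork_point(parents, upstream_log, my_tip):
    """Newest recorded upstream tip that my branch is built on, or None."""
    mine = set(ancestors(parents, my_tip))
    for old_tip in upstream_log:  # newest first
        if old_tip in mine:
            return old_tip
    return None


def commits_to_replay(parents, upstream_log, my_tip):
    """My own commits, oldest first, or None if no fork point is known."""
    stop = fork_point(parents, upstream_log, my_tip)
    if stop is None:
        return None  # cannot tell my work from the copied work
    own = []
    for cid in ancestors(parents, my_tip):
        if cid == stop:
            break
        own.append(cid)
    return own[::-1]


# Old upstream: base <- B <- C. Rebased upstream: base <- M <- B2 <- C2.
# My work: D and E, built on the old C.
parents = {"base": None, "B": "base", "C": "B", "D": "C", "E": "D",
           "M": "base", "B2": "M", "C2": "B2"}

print(commits_to_replay(parents, ["C2", "C"], "E"))  # ['D', 'E']
print(commits_to_replay(parents, ["C2"], "E"))       # None
```

While the old tip C is still on record, only D and E need to be copied onto C2. Once that record is gone, this model can no longer separate my commits from the ones you copied.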

If we have all agreed to this in advance, and we all know how to do it, then it is OK to rebase your published commits. If not, well, you might be making a lot of work for someone else, perhaps many "someone else"s, who may not know how to use the (not so great) tools for dealing with an "upstream rebase".

torek answered Sep 25 '22 21:09


A rebase operation is a potentially "destructive" one precisely because it is so powerful. The power of rebase comes from the fact that it can rewrite the commit history of a particular branch. This doesn't seem like a big deal until you consider what happens in any situation where two or more people disagree on a historical record.

When you push a branch up to the server, you're not just pushing the current state of the branch, you're pushing its history as well. Any future merge or branching will take this history into account when applying/comparing changes.

If you rebase a branch that's already been pushed to the server, anyone who had already pulled the branch will now be unable to plot a clean path from their commits to yours. Thus, they will be faced with inconsistent histories and merge conflicts rippling across the repository.
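One way to read "a clean path" is as an ancestry check: an update is clean (a fast-forward) when one tip is reachable from the other by following parent links. A minimal Python sketch, assuming a history stored as a dict from commit ID to a list of parent IDs, with hypothetical IDs:

```python
def is_ancestor(parents, older, newer):
    """True if `older` is reachable from `newer` by following parent links."""
    stack, seen = [newer], set()
    while stack:
        cid = stack.pop()
        if cid == older:
            return True
        if cid not in seen:
            seen.add(cid)
            stack.extend(parents.get(cid, []))
    return False


# Server branch before the rebase: base <- B <- C.
# A teammate pulled it and committed D on top of C.
# After the rebase and force push, the server branch is base <- M <- B2 <- C2.
parents = {"base": [], "B": ["base"], "C": ["B"], "D": ["C"],
           "M": ["base"], "B2": ["M"], "C2": ["B2"]}

print(is_ancestor(parents, "C", "D"))   # True: D extends the old server tip
print(is_ancestor(parents, "C2", "D"))  # False: D does not contain the new tip
print(is_ancestor(parents, "D", "C2"))  # False: the new tip does not contain D
```

When neither tip contains the other, the histories have diverged, and the teammate has to merge or rebase before their work lines up with the server again.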

The unfortunate souls dealing with this will likely not have warm and fuzzy feelings towards you, because you've just cost them a non-trivial amount of time and effort.

Josh E answered Sep 26 '22 21:09