 

What are the conceptual differences between Merging, Stashing and Rebasing in Git?

I have been using Merging heavily on the master branch. Recently, though, while developing a feature, merging made the project history complicated. I came across Rebasing, which solved my problem, and along the way I also learned about the golden rule of rebasing.

I have also used Stashing at times. It worked, but I feel like the same thing could have been achieved with merging.

Although I use these commands, I would get a clearer understanding if someone could explain the key conceptual facts and rules behind these three. Thanks.

Syed Priom asked Nov 16 '16 19:11



1 Answer

Let's say you have this repository. A, B, and C are commits. master is at C.

A - B - C [master]

You make a branch called feature. It points to C.

A - B - C [master]
          [feature]

You do some work on both master and feature.

A - B - C - D - E - F [master]
         \
          G - H - I [feature]

You want to update feature with the changes from master. You could merge master into feature, resulting in a merge commit J.

A - B - C - D - E - F [master]
         \           \
          G - H - I - J [feature]
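
As a rough sketch of the commands behind that picture, the script below builds the same history in a throwaway repository and merges master into feature. The commit helper, the one-file-per-commit layout, and the user name and email are assumptions made up for the example, not anything from a real project.

```shell
#!/bin/sh
# Sketch: recreate A..F on master and G..I on feature, then merge master into feature.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Example"            # placeholder identity for the throwaway repo
git config user.email "example@example.com"

# commit <name>: hypothetical helper; one new file per commit so nothing conflicts
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

for c in A B C; do commit "$c"; done
git branch feature                        # feature points at C
for c in D E F; do commit "$c"; done      # work on master
git checkout -q feature
for c in G H I; do commit "$c"; done      # work on feature

git merge -q --no-edit master             # creates the merge commit J on feature
git log --graph --oneline                 # shows the two lines of history joining at J
```

The merge commit has two parents: the old tip of feature (I) and the tip of master (F).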

If you do this enough times, things start to get messy.

A - B - C - D - E - F - K - L - O - P - Q [master]
         \           \       \       \
          G - H - I - J - M - N - U - R - S [feature]

That might look simple, but that's only because I've drawn it that way. Git history is a graph (in the computer science sense) and nothing says it has to be drawn like that. Nothing explicitly records that, for example, commit M is part of branch feature either. You have to figure that out from the graph, and that can get messy.

When you decide you're done and merge feature into master, it gets worse.

A - B - C - D - E - F - K - L - O - P - Q - T [master]
         \           \       \       \     /
          G - H - I - J - M - N - U - R - S [feature]

Now it's difficult to tell that M was originally part of feature. Again, I've chosen a nice way to draw it, but Git doesn't necessarily know to do that. M is an ancestor of both master and feature. This makes it difficult to interpret history and figure out what was done in which branch. It can also cause unnecessary merge conflicts.


Let's start over and rebase instead.

A - B - C - D - E - F [master]
         \
          G - H - I [feature]

Rebasing a branch onto another branch is conceptually like moving that branch to the tip of the other. Rebasing feature onto master is like this:

                      G1 - H1 - I1 [feature]
                     /
A - B - C - D - E - F [master]
         \
          G - H - I

Every commit in feature is replayed on top of master. It's as if you took the diff between C and G, applied it to F, and called that G1. Then the diff between G and H gets applied to G1, that's H1. And so on.

There's no merge commit. It's as if you wrote the feature branch on top of master all along. This keeps a nice, clean, linear history that isn't littered with merge commits that don't tell you anything.
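
Here is a minimal sketch of that rebase in a throwaway repository. As before, the commit helper, file names, and identity are invented for the example.

```shell
#!/bin/sh
# Sketch: recreate A..F on master and G..I on feature, then rebase feature onto master.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Example"            # placeholder identity for the throwaway repo
git config user.email "example@example.com"

# commit <name>: hypothetical helper; one new file per commit so nothing conflicts
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

for c in A B C; do commit "$c"; done
git branch feature
for c in D E F; do commit "$c"; done
git checkout -q feature
for c in G H I; do commit "$c"; done

git rebase -q master                      # replay G, H, I on top of F as G1, H1, I1
git log --graph --oneline                 # a single straight line, no merge commit
```

After the rebase, feature contains all of master plus three replayed commits, and the point where the two branches meet is the tip of master.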

Note that the old feature commits are still there. Nothing points to them any more, so they will eventually be garbage collected. I've left them in the diagram to show you that rebase does not rewrite history; instead, rebase creates new history and then we pretend it was that way all along. This is important for two reasons:

First, if you screw up a rebase the old branch is still there. You can find it with git reflog or using ORIG_HEAD.

Second, and most important, a rebase results in new commit IDs. Everything in Git is identified by ID, so anyone who based work on the old commits now has history that no longer matches yours. This is why rebasing a shared branch introduces complications.
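
Both points can be seen in a small experiment. This is only a sketch in a throwaway repository (the commit helper, file names, and identity are made up), and it assumes a rebase that finishes cleanly, so ORIG_HEAD still holds the pre-rebase tip.

```shell
#!/bin/sh
# Sketch: a rebase produces new commit IDs, and the old tip can still be recovered.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Example"            # placeholder identity for the throwaway repo
git config user.email "example@example.com"

# commit <name>: hypothetical helper; one new file per commit so nothing conflicts
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

for c in A B C; do commit "$c"; done
git branch feature
for c in D E F; do commit "$c"; done
git checkout -q feature
for c in G H I; do commit "$c"; done

old_tip=$(git rev-parse feature)          # ID of I before the rebase
git rebase -q master
new_tip=$(git rev-parse feature)          # ID of I1: a different commit
orig=$(git rev-parse ORIG_HEAD)           # rebase recorded the old tip here

git reflog show feature                   # the old tip is also listed as feature@{1}
git reset -q --hard ORIG_HEAD             # undo the rebase: feature is back at I
```

If ORIG_HEAD has since been overwritten by another command, the reflog entry is the more dependable way back.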

There's A LOT more to say about rebasing vs. merging, so I'll leave it at this:

  • To update a branch, use rebase. This avoids messy intermediate merges.
  • To finish a branch...
    • Update it using rebase.
    • Then use merge --no-ff to force a merge commit to be created.
    • Then delete the feature branch and never use it again.

The end result you want to see in your history is a "feature bubble".

                      G1 - H1 - I1
                     /            \
A - B - C - D - E - F ------------ J [master]

This keeps history linear while still giving code archeologists the important context that G1, H1, and I1 were done as part of a branch and should be examined together.
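
Put together, finishing a branch this way might look like the sketch below. It runs in a throwaway repository; the commit helper, file names, and identity are assumptions for the example.

```shell
#!/bin/sh
# Sketch: update feature with rebase, finish it with merge --no-ff, then delete it.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Example"            # placeholder identity for the throwaway repo
git config user.email "example@example.com"

# commit <name>: hypothetical helper; one new file per commit so nothing conflicts
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

for c in A B C; do commit "$c"; done
git branch feature
for c in D E F; do commit "$c"; done
git checkout -q feature
for c in G H I; do commit "$c"; done

git rebase -q master                      # update: G1, H1, I1 now sit on top of F
git checkout -q master
f=$(git rev-parse master)                 # remember F, the tip before the merge
git merge -q --no-ff --no-edit feature    # force merge commit J even though a fast-forward is possible
git branch -q -d feature                  # the branch is merged, so -d is allowed
git log --graph --oneline                 # shows the bubble: F, then G1-H1-I1, closed by J
```

Without --no-ff, Git would fast-forward master to I1 and the bubble would disappear from the graph.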


Stashing is something completely different. It's basically a special branch-like ref (refs/stash) that stores your uncommitted changes as patches.

Sometimes you're in the middle of something that's not ready to commit, but you need to do some other work. You could put it in a patch file with git diff > some.patch, reset your working directory, do the other work, commit it, then apply some.patch. Or you can git stash push (git stash save in older versions of Git) and later git stash pop.
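
A minimal sketch of that workflow, in a throwaway repository with made-up file names and identity. It uses git stash push, the current spelling of git stash save, so it assumes a reasonably recent Git.

```shell
#!/bin/sh
# Sketch: set aside uncommitted work, do something else, then bring the work back.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Example"            # placeholder identity for the throwaway repo
git config user.email "example@example.com"

echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "half-finished idea" >> notes.txt    # uncommitted change to a tracked file
git stash push -q -m "wip on notes"       # working directory is clean again
clean=$(git status --porcelain)           # empty: nothing left to commit

echo "urgent fix" > fix.txt               # the other work
git add fix.txt
git commit -q -m "Urgent fix"

git stash pop -q                          # reapply the change and drop the stash entry
git status --short                        # notes.txt shows as modified again
```

By default only changes to tracked files are stashed, which is why the example edits a file that is already committed.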

Schwern answered Oct 12 '22 23:10
