
Making two branches identical

I have two branches - A and B. B was created from A.

Work on both branches continued in parallel. Work on branch A was bad (resulting in a non-working version) while work on branch B was good. During this time branch B was sometimes merged into branch A (but not the other way around).

Now I want to make branch A identical to branch B. I can't use git revert because I need to revert too many commits - I only want to revert the commits that were done on branch A but not as a result of merging branch B.

The solution I found was to clone branch B into another folder, delete all the files from the working folder of branch A, copy over the files from the temporary branch B folder, and add all untracked files.

Is there a git command that does the same thing? Some git revert switch I missed?

zmbq asked Mar 30 '16 22:03




2 Answers

There are a lot of ways to do this, and which one you should use depends on what result you want, and in particular on what you and anyone collaborating with you (if this is a shared repository) expect to see in the future.

The three main ways to do this:

  1. Don't bother. Abandon branch A, make a new A2, and use that.

  2. Use git reset or equivalent to re-point A elsewhere.

    Methods 1 and 2 are effectively the same in the long run. For instance, suppose you start by abandoning A. Everyone develops on A2 instead for a while. Then, once everyone is using A2, you delete the name A entirely. Now that A is gone, you can even rename A2 to A (everyone else using it will have to do the same rename, of course). At this point, what, if anything, looks different between the case where you used method 1 and the case where you used method 2? (There is one place you may still be able to see a difference, depending on how long your "long run" is and when you expire old reflogs.)

  3. Make a merge, using a special strategy (see "alternative merge strategies" below). What you want here is -s theirs, but it doesn't exist and you must fake it.
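Methods 1 and 2 can be sketched in a throwaway repository as follows. This is only an illustration: the temporary directory, the file names file.txt and good.txt, and the "Demo" identity are made up for the example, and git branch -f is used as one possible "equivalent" of git reset for a branch that is not checked out.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption)
git init -q
git config user.name "Demo"; git config user.email "demo@example.com"

echo base > file.txt; git add file.txt; git commit -q -m "base"
git branch -m A                          # name the initial branch A
git branch B                             # B is created from A
echo "bad" >> file.txt; git commit -q -am "bad work on A"
git checkout -q B
echo "good" > good.txt; git add good.txt; git commit -q -m "good work on B"

git branch A2 B        # method 1: leave A alone, carry on under a new name
git branch -f A B      # method 2: re-point A itself (A is not checked out here)

git rev-parse A A2 B   # prints the same commit ID three times
```

Either way the bad commits are no longer reachable from the branch name people use, which is why the two methods look the same in the long run.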

Side note: where a branch is "created from" is basically irrelevant: git doesn't keep track of such things and in general it is not possible to discover it later, so it probably should not matter to you either. (If it really does matter to you, you can impose a few restrictions on how you manipulate the branch, or mark its "start point" with a tag, so that you can recover this information mechanically later. But this is often a sign that you are doing something wrong in the first place—or at the least, expecting something of git that git does not provide. See also footnote 1 below.)

Definition of branch (see also What exactly do we mean by branch?)

A branch—or more precisely, a branch name—in git is simply a pointer to some specific commit within the commit graph. This is also true of other references like tag names. What makes a branch special, as compared with a tag for instance, is that when you are on a branch and make a new commit, git automatically updates the branch name so that it now points to the new commit.
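To see the "moving pointer" behaviour, here is a small sketch in a scratch repository. The tag name start, the file name, and the identity are invented for the example.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption)
git init -q
git config user.name "Demo"; git config user.email "demo@example.com"

echo one > file.txt; git add file.txt; git commit -q -m "first"
git branch -m B                          # call the current branch B
git tag start                            # a tag pointing at the same commit
before=$(git rev-parse B)

echo two >> file.txt; git commit -q -am "second"
after=$(git rev-parse B)

echo "B before: $before"
echo "B after:  $after"                  # B now names the new commit
echo "tag:      $(git rev-parse start)"  # the tag still names the first commit
```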

For instance, suppose we have a commit graph that looks something like this, where o nodes (and the one marked * as well) represent commits and A and B are branch-names:

    o <- * <- o <- o <- o <- o   <-- A
          \
           o <- o <- o           <-- B

A and B each point to the tip-most commit of a branch, or more precisely, the branch data structure formed by starting at some commit and working back through all reachable commits, with each commit pointing to some parent commit(s).

If you use git checkout B so that you're on branch B, and make a new commit, the new commit is set up with the previous tip of B as its (single) parent, and git changes the ID stored under the branch name B so that B points to the new tip-most commit:

    o <- * <- o <- o <- o <- o   <-- A
          \
           o <- o <- o <- o      <-- B

The commit marked * is the merge base of the two branch tips. This commit and all earlier commits are on both branches.[1] The merge base matters for, well, merging (duh :-) ), but also for things like git cherry and other release-management type operations.
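If you want git to tell you which commit is the one marked *, git merge-base does that. A minimal sketch in a scratch repository (file names and identity are assumptions for the example):

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption)
git init -q
git config user.name "Demo"; git config user.email "demo@example.com"

echo base > file.txt; git add file.txt; git commit -q -m "fork point"
git branch -m A
fork=$(git rev-parse HEAD)               # remember the commit marked *
git branch B
echo a >> file.txt; git commit -q -am "work on A"
git checkout -q B
echo b > b.txt; git add b.txt; git commit -q -m "work on B"

echo "fork point: $fork"
echo "merge base: $(git merge-base A B)" # same ID as the fork point
```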

You mention that branch B was occasionally merged back into A:

git checkout A; git merge B 

This makes a new merge commit, which is a commit with two[2] parents. The first parent is the previous tip of the current branch, i.e., the previous tip of A, and the second parent is the named commit, i.e., the tip-most commit of branch B. Redrawing the above a bit more compactly (but adding some more -s to make the os line up better), we start with:

    o--*--o--o--o--o      <-- A
        \
         o---o--o---o     <-- B

and end up with:

    o--*--o--o--o--o--o   <-- A
        \            /
         o---o--o---*     <-- B

We move the * to the new merge base of A and B (which is in fact the tip of B). Presumably we then add some more commits to B and maybe merge a few more times:

    ...---o--...--o--o    <-- A
         /       /
    ...-o--...--*--o--o   <-- B
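The parent order and the moved merge base can be checked directly. This is a sketch in a scratch repository with made-up file names and identity:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption)
git init -q
git config user.name "Demo"; git config user.email "demo@example.com"

echo base > file.txt; git add file.txt; git commit -q -m "base"
git branch -m A
git branch B
echo a >> file.txt; git commit -q -am "work on A"
git checkout -q B
echo b > b.txt; git add b.txt; git commit -q -m "work on B"

git checkout -q A
old_a=$(git rev-parse A)                 # tip of A before the merge
git merge -q -m "merge B into A" B

git log -1 --pretty=%P A     # two IDs: the old tip of A, then the tip of B
git merge-base A B           # the merge base is now the tip of B
```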

What git does by default with git merge <thing>

To make a merge, you check out some branch (A in these cases) and then run git merge and give it at least one argument, typically another branch name like B. The merge command starts by turning the name into a commit ID. A branch name turns into the ID of the tip-most commit on the branch.

Next, git merge finds the merge base. These are the commits we have been marking with * all along. The technical definition of the merge base is the Lowest Common Ancestor in the commit graph (and in some cases there may be more than one) but we'll just go with "the commit marked *" here for simplicity.

Last, for ordinary merges, git runs two git diff commands.[3] The first git diff compares commit * against the HEAD commit, i.e., the tip of the current branch. The second diff compares commit * against the argument commit, i.e., the tip of the other branch (you can name a specific commit and it need not be the tip of a branch, but in our case we're merging B into A so we get those two branch-tip commits).
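You can approximate those two diffs by hand before merging. The sketch below lists which files each side changed relative to the merge base. Git does this internally rather than by running these exact commands, and the repository layout here is invented for the example.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption)
git init -q
git config user.name "Demo"; git config user.email "demo@example.com"

echo base > file.txt; git add file.txt; git commit -q -m "base"
git branch -m A
git branch B
git checkout -q B
echo b > b.txt; git add b.txt; git commit -q -m "work on B"
git checkout -q A
echo a >> file.txt; git commit -q -am "work on A"

base=$(git merge-base HEAD B)                # the commit marked *
ours=$(git diff --name-only "$base" HEAD)    # what "we" changed since *
theirs=$(git diff --name-only "$base" B)     # what "they" changed since *
echo "changed on A: $ours"
echo "changed on B: $theirs"
```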

When git finds some file modified, as compared to the merge-base version, in both branches, git tries to combine those changes in a semi-smart (but not really smart) way: if both changes add the same text to the same region, git keeps one copy of the addition. If both changes delete the same text in the same region, git deletes that text just once. If both changes modify text in the same region, you get a conflict, unless the modifications match exactly (then you get one copy of the modifications). If one side makes one change and the other side makes a different change, but the changes seem not to overlap, git takes both changes. This is the essence of a three-way merge.
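Here is a small sketch of the "changes seem not to overlap" case: both branches edit the same file, one at the top and one at the bottom, and the merge keeps both edits. The ten-line file and the wording of the edits are assumptions chosen to keep the two changes well apart.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption)
git init -q
git config user.name "Demo"; git config user.email "demo@example.com"

seq 1 10 > file.txt; git add file.txt; git commit -q -m "base"
git branch -m A
git branch B
{ echo "first line from A"; seq 2 10; } > file.txt
git commit -q -am "A edits the first line"
git checkout -q B
{ seq 1 9; echo "last line from B"; } > file.txt
git commit -q -am "B edits the last line"

git checkout -q A
git merge -q -m "merge B into A" B       # no conflict: the edits are far apart
cat file.txt                             # shows A's first line and B's last line
```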

Last, assuming all goes well, git makes a new commit that has two (or more as we already noted in footnote 2) parents. The work-tree associated with this new commit is the one git came up with when it did its three-way merge.

Alternative merge strategies

While git merge's default strategy (recursive when this was written; ort since Git 2.34) has -X options ours and theirs, they do not do what we want here. These simply say that in the case of a conflict, git should automatically resolve that conflict by choosing "our change" (-X ours) or "their change" (-X theirs).
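A quick sketch of why -X theirs is not enough: it only settles conflicting hunks, so anything A changed without a conflict (here a made-up extra file, a-only.txt) survives the merge and A still differs from B.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption)
git init -q
git config user.name "Demo"; git config user.email "demo@example.com"

echo base > conflict.txt; git add conflict.txt; git commit -q -m "base"
git branch -m A
git branch B
echo "from A" > conflict.txt; echo junk > a-only.txt
git add conflict.txt a-only.txt; git commit -q -m "bad work on A"
git checkout -q B
echo "from B" > conflict.txt; git commit -q -am "good work on B"

git checkout -q A
git merge -q -X theirs -m "merge B, prefer B on conflicts" B

cat conflict.txt                 # B's version won the conflict
ls a-only.txt                    # but A's extra file survived the merge
if git diff --quiet A B; then echo "A matches B"; else echo "A still differs from B"; fi
```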

The merge command has another strategy entirely, -s ours: this one says that instead of diffing the merge base against the two commits, just use our source tree. In other words, if we're on branch A and we run git merge -s ours B, git will make a new merge commit with the second parent being the tip of branch B, but the source tree matching the version in the previous tip of branch A. That is, the code for the new commit will exactly match the code for its parent.
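The effect of -s ours is easy to check by comparing tree IDs before and after the merge. A minimal sketch, again in a scratch repository with invented file names:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption)
git init -q
git config user.name "Demo"; git config user.email "demo@example.com"

echo base > file.txt; git add file.txt; git commit -q -m "base"
git branch -m A
git branch B
echo a >> file.txt; git commit -q -am "work on A"
git checkout -q B
echo b > b.txt; git add b.txt; git commit -q -m "work on B"

git checkout -q A
old_a=$(git rev-parse A)
git merge -q -s ours -m "merge B, keep A's tree" B

git rev-parse 'A^{tree}' "$old_a^{tree}"   # the same tree ID twice
git log -1 --pretty=%P A                   # yet B's tip is recorded as second parent
```

Note that b.txt from B never shows up on A: the merge is recorded in the graph, but B's content is ignored.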

As outlined in this other answer, there are a number of ways to force git to effectively implement -s theirs. I think the simplest to explain is this sequence:

    git checkout A
    git merge --no-commit -s ours B
    git rm -rf .         # make sure you are at the top level!
    git checkout B -- .
    git commit

The first step is to ensure that we are on branch A, as usual. The second is to fire up a merge, but avoid committing the result yet (--no-commit). To make the merge easier for git—this is not needed, it just makes things faster and quieter—we use -s ours so that git can skip the diff steps entirely and we avoid all merge conflict complaints.

At this point we get to the meat of the trick. First we remove the entire merge result, since it is actually worthless: we do not want the work-tree from the tip commit of A, but rather the one from the tip of B. Then we check out every file from the tip of B, making it ready to commit.

Last, we commit the new merge, which has as its first parent the old tip of A and as its second parent the tip of B, but has the tree from commit B.

If the graph just before the commit was:

    ...---o--...--o--o    <-- A
         /       /
    ...-o--...--*--o--o   <-- B

then the new graph is now:

    ...---o--...--o--o--o   <-- A
         /       /     /
    ...-o--...--o--o--*     <-- B

The new merge-base is the tip of B as usual, and from the perspective of the commit graph, this merge looks exactly like any other merge. What's unusual is that the source tree for the new merge at the tip of A exactly matches the source tree for the tip of B.
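To convince yourself the trick does what is claimed, you can run the sequence in a scratch repository and compare the results. This is a sketch under assumptions: the file names and identity are invented, and -q / -m flags are added only so the script runs quietly and without opening an editor.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption)
git init -q
git config user.name "Demo"; git config user.email "demo@example.com"

echo base > file.txt; git add file.txt; git commit -q -m "base"
git branch -m A
git branch B
echo bad >> file.txt; echo junk > a-only.txt
git add file.txt a-only.txt; git commit -q -m "bad work on A"
git checkout -q B
echo good >> file.txt; git commit -q -am "good work on B"

git checkout -q A
old_a=$(git rev-parse A)
git merge -q --no-commit -s ours B       # start the merge, do not commit yet
git rm -q -rf .                          # drop A's files (run at the top level)
git checkout B -- .                      # take every file from the tip of B
git commit -q -m "merge B, taking B's tree"

git rev-parse 'A^{tree}' 'B^{tree}'      # identical tree IDs
git log -1 --pretty=%P A                 # parents: old tip of A, then tip of B
```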


[1] In this particular case, since the two branches have remained independent (never been merged), it's also probably the point where one of the two branches was created (or maybe even where both were created), although you can't prove that at this point (because someone may have used git reset or various other tricks to move the branch labels around in the past). As soon as we start merging, though, the merge base is clearly no longer the starting point, and a sensible starting point gets more difficult to locate.

[2] Technically, a merge commit is any commit with two or more parents. Git calls merges with more than two parents "octopus merges". In my experience, they are not common except in the git repository for git itself, and in the end, they achieve the same thing as multiple ordinary two-parent merges.

[3] The diffs are usually done internally to some extent, rather than running actual commands. This allows a lot of short-cut optimizations. It also means that if you write a custom merge driver, that custom merge driver is not run unless git finds that the file is modified in both diffs. If it's only modified in one of the two diffs, the default merge simply takes the modified one.

torek answered Sep 29 '22 08:09



Checkout your target branch:

git checkout A; 

To discard everything on branch A and make it point at the same commit as B:

git reset B --hard; 

If you only need to discard your uncommitted local changes (unlike git revert, this adds no new commits to the history):

git reset HEAD --hard; 

When you are done, don't forget to update your remote branch (use the --force or -f flag if you need to overwrite the remote history).

git push origin A -f; 
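The whole sequence can be rehearsed against a local bare repository standing in for the remote. This is only a sketch: the directories, file names and identity are made up, and --force-with-lease is swapped in for -f as a somewhat more cautious variant that refuses to overwrite the remote branch if it has moved since you last fetched or pushed it.

```shell
set -e
work=$(mktemp -d); remote=$(mktemp -d)   # scratch directories (assumption)
git init -q --bare "$remote"             # stands in for the real remote
cd "$work"; git init -q
git config user.name "Demo"; git config user.email "demo@example.com"
git remote add origin "$remote"

echo base > file.txt; git add file.txt; git commit -q -m "base"
git branch -m A
git branch B
echo bad >> file.txt; git commit -q -am "bad work on A"
git checkout -q B
echo good > good.txt; git add good.txt; git commit -q -m "good work on B"
git push -q origin A B

git checkout -q A
old_a=$(git rev-parse A)                 # tip of A before the reset
git reset -q --hard B                    # A now names the same commit as B
git push -q --force-with-lease origin A  # rewrite A on the remote as well

git rev-parse A B                        # the same ID twice
git rev-parse 'A@{1}'                    # old tip of A, still in the local reflog
```

Unlike the merge approach, this drops A's old commits from the branch history, so anyone else who has A checked out will need to reset their copy too.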
Pianov answered Sep 29 '22 08:09
