Who is `them` and `us` in a `git revert`?

I can't make sense of who us and them are in these conflicts during a git revert, so I don't really know what's happening here:

git revert some_commit_hash

Then git status shows the following conflicts:

deleted by them: path/to/file1.h
both modified:   path/to/file2.h
deleted by them: path/to/file1.cpp
deleted by them: path/to/test_file1.cpp
added by us:     path/to/file3.h
deleted by them: path/to/file4.h
added by us:     path/to/file5.h

Who is "us"? Who is "them"?

Update: note that the commit I'm reverting is a very large merge commit.


NOT duplicates:

  1. because it doesn't clarify who is us and them: GIT: How dangerous is "deleted by us" conflict?
  2. because it covers merge and rebase but NOT revert, and git frequently uses the same terms to mean opposite things depending on the operation: Who is "us" and who is "them" according to Git?
  3. because it doesn't mention "us" and "them" - Git - reverting a revert, conflicts
asked Sep 15 '20 18:09 by Gabriel Staples


3 Answers

When a conflict occurs, the rule that applies in all situations is:

  • ours/us is the state of the current HEAD (the active commit)
  • theirs/them is the state of the other side (the commit being merged, the commit being cherry-picked/rebased, or in your case the "reverse" of the commit you want to revert)

Some extra clarifications in the case of a rebase (answering @GabrielStaples' comment):

If you are on my/branch, and you run git rebase other/branch, git will check out the head commit of other/branch and start replaying some commits on top.

If a conflict occurs, since the checked out commit comes from other/branch, ours will roughly represent other/branch, and theirs will be my/branch.

This part is contrary to the intuition "ours should be my changes", but it fits the above description: at the time of the conflict, the checked out commit is ours, the other side (the commit being replayed) is theirs.
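A minimal sketch of the rebase case, assuming a throwaway repository in a temp directory. The branch names mirror the ones above; the file name f.txt, its contents, and the demo identity are made up for illustration. During a conflict, stage 2 of the index holds "ours" and stage 3 holds "theirs":

```shell
# Throwaway repository in a temp dir; the identity below is a placeholder.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q

echo base > f.txt; git add f.txt; git commit -qm "base"
git branch other/branch
git checkout -q -b my/branch
echo mine > f.txt;  git commit -qam "change on my/branch"
git checkout -q other/branch
echo other > f.txt; git commit -qam "change on other/branch"

git checkout -q my/branch
git rebase other/branch >/dev/null 2>&1 || true   # stops on a conflict in f.txt

ours=$(git show :2:f.txt)     # stage 2 = "ours"   = checked-out tip of other/branch
theirs=$(git show :3:f.txt)   # stage 3 = "theirs" = commit being replayed from my/branch
echo "ours=$ours theirs=$theirs"
```

In this sketch "ours" comes out as the content from other/branch, even though the rebase was started from my/branch.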

answered Oct 21 '22 10:10 by LeGEC


Although this is already answered pretty well, there's one more way to look at it all. That's the way that Git itself looks at it. All four operations—cherry-pick, merge, rebase, and revert—use the same machinery, and the --ours and --theirs flags to git checkout, and the -X ours and -X theirs extended-options, wind up referring to the same things, using the same internal code. I like to refer to this machinery as merge as a verb, because we first get introduced to it through git merge, when merge must do a real merge.

The merge case

When doing a real merge, the terms make sense. We start with what can be illustrated this way:

          I--J   <-- ourbranch (HEAD)
         /
...--G--H
         \
          K--L   <-- theirbranch

Here, the name ourbranch selects commit J, which is our commit on our branch (one of two such commits in this case, though the number of commits that are exclusively on our own branch need only be at least 1 to force a real merge). The name theirbranch selects commit L, which is their commit on their branch (again one of two, with at least one commit being necessary here).

What Git does in order to do this merging—to merge as a verb some set of files—is, for each file in all three commits H, J, and L, compare the file in H vs that in J to see what we changed, and compare the file in H vs that in L to see what they changed. Then Git combines these two sets of changes, applying the combined changes to whatever is in H.

Commit H is the merge base commit, commit J is the "ours" commit, and commit L is the "theirs" commit. Any difference, whether it's a new file "added by us", or a file "deleted by them", or whatever, is with respect to commit H.

In order to run the merge through the merge machinery, Git does a slightly-optimized-in-advance version of the following:

  1. Setup:

    • read merge base commit (H) into index at slot 1
    • read ours commit (HEAD = J) into index at slot 2
    • read theirs commit (L) into index at slot 3
  2. Identify "same files". Note that steps 2 and 3 repeat for every file.

    • if there's a file named F in all three slots, it's the same file
    • otherwise, if there's anything in slot 1, try to guess about renames, which will tie a merge base file in slot 1 to an ours or theirs file of a different name that's in slot 2 and/or slot 3; if no file can be found to call a rename, the ours and/or theirs side deleted this file; these cases may also lead to a high-level conflict such as rename/modify or rename/delete, where we declare a conflict and move on without doing step 3
    • otherwise (nothing in slot 1, but something in slots 2 and 3) we have an add/add conflict: declare this particular file to be in conflict, and move on without doing step 3
  3. Short circuit easy cases, and do the hard cases with a low-level merge:

    • if the blob hash IDs in slots 1, 2, and 3 all match, all three copies are the same; use any of them
    • if the blob hash ID in slot 1 matches that in 2 or 3, someone didn't change the file and someone did; use the changed file, i.e., take the file that's different
    • otherwise, all three slots differ: do a changed-block-of-lines by changed-block, low-level merge
      • if there's a merge conflict during the low level merge, -X ours or -X theirs means "resolve the conflict using ours/theirs" where ours is whatever is in slot 2 and theirs is whatever is in slot 3
      • note that this means wherever there is no conflict, e.g., only one "side" changed line 42, the -X extended option does not apply at all, and we take the modification, regardless of whether it is ours or theirs
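The last two points can be checked with a small experiment. This is only a sketch under assumptions: the 20-line file, the helper make_file, and the branch names are invented for the demo. Both sides change line 2 (a real conflict), while only "theirs" changes line 18:

```shell
# Throwaway repository in a temp dir; the identity below is a placeholder.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q

# Hypothetical helper: print lines 1..20, replacing line 2 with $1 and line 18 with $2.
make_file() { seq 1 20 | sed -e "s/^2\$/$1/" -e "s/^18\$/$2/"; }

make_file 2 18 > f.txt; git add f.txt; git commit -qm "base"
git branch theirbranch
git checkout -q -b ourbranch
make_file ours-2 18 > f.txt;          git commit -qam "ours: change line 2"
git checkout -q theirbranch
make_file theirs-2 theirs-18 > f.txt; git commit -qam "theirs: change lines 2 and 18"

git checkout -q ourbranch
git merge -q --no-edit -X ours theirbranch >/dev/null 2>&1

line2=$(sed -n 2p f.txt)     # conflicting hunk: resolved in favor of ours
line18=$(sed -n 18p f.txt)   # non-conflicting hunk: theirs is still taken
echo "line 2: $line2, line 18: $line18"
```

So -X ours only decides the conflicting hunk; the change that only "theirs" made still lands in the result.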

At the end of this process, any fully resolved file is moved back to its normal slot-zero position, with the slot 1, 2, and 3 entries being removed. Any unresolved file is left with all three index slots occupied (in delete conflicts and add/add conflicts, some slots are empty, but some nonzero stage number slot is in use, which marks the file as being conflicted).

Hence "to merge", or merge as a verb, operates in Git's index

All of the action above happens in Git's index, with the side effect of leaving updated files in your work-tree. If there are low-level conflicts, your work-tree files are marked up with conflict markers, and the sections between the markers hold the lines taken from the copies of the file that are in index slots 1 (merge base), 2 (ours), or 3 (theirs).

Ultimately it always boils down to that same equation: 1 = merge base, 2 = ours, 3 = theirs. This holds true even when the command that loads the index is not git merge.
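To look at the three slots directly, a sketch along these lines may help. It assumes a throwaway repository; the file name and contents are invented, and the commit subjects H, J, and L follow the diagram above:

```shell
# Throwaway repository in a temp dir; the identity below is a placeholder.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q

echo base > f.txt; git add f.txt; git commit -qm H
git branch theirbranch
git checkout -q -b ourbranch
echo ours > f.txt;   git commit -qam J
git checkout -q theirbranch
echo theirs > f.txt; git commit -qam L

git checkout -q ourbranch
git merge theirbranch >/dev/null 2>&1 || true   # stops with a conflict in f.txt

git ls-files -u            # one entry per occupied slot (stage number) for f.txt
slot1=$(git show :1:f.txt) # merge base
slot2=$(git show :2:f.txt) # ours
slot3=$(git show :3:f.txt) # theirs
echo "1=$slot1 2=$slot2 3=$slot3"
```

The :N:path syntax reads a file straight out of index slot N, so it works the same way after a conflicted cherry-pick, revert, or rebase.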

Cherry-pick and revert use the merge machinery

When we run git cherry-pick, we have a commit graph that looks like this:

...--P--C--...
   \
    ...--H   <-- somebranch (HEAD)

The letters P and C here stand in for any parent-and-child pair of commits. C can even be a merge commit, as long as we use the -m option to specify which parent to use. (There's no real constraint on where the three commits live in the graph: I've drawn it with H a child of some commit that comes before P, but it can be after the P-C pair, as in ...-E-P-C-F-G-H for instance, or there may be no relationship at all between the P-C and H commits, if you have multiple disjoint subgraphs.)

When we run:

git cherry-pick <hash-of-C>

Git will locate commit P on its own, using the parent link from C back to P. P now acts as the merge base, and is read into index slot 1. C acts as the --theirs commit, and is read into index slot 3. Our current commit H is the --ours commit, and is read into index slot 2. The merge machinery runs now, so "our" commit is HEAD and "their" commit is commit C, with the merge base—which shows up if we set merge.conflictStyle to diff3, or if we use git mergetool to run a merge tool—being commit P.
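As a rough check of the cherry-pick slot assignment, here is a sketch in a throwaway repository. The file f.txt and its one-word contents are assumptions, chosen so that each commit's version is easy to tell apart:

```shell
# Throwaway repository in a temp dir; the identity below is a placeholder.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q

echo p > f.txt; git add f.txt; git commit -qm P
git branch somebranch                 # somebranch forks off at P
echo c > f.txt; git commit -qam C
c=$(git rev-parse HEAD)
git checkout -q somebranch
echo h > f.txt; git commit -qam H

git cherry-pick "$c" >/dev/null 2>&1 || true   # stops with a conflict in f.txt

slot1=$(git show :1:f.txt)   # merge base: P, the parent of C
slot2=$(git show :2:f.txt)   # ours: HEAD, i.e. H
slot3=$(git show :3:f.txt)   # theirs: C, the commit being picked
echo "1=$slot1 2=$slot2 3=$slot3"
```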

When we run:

git revert <hash-of-C>

the same thing happens, except this time, commit C is the merge base in slot 1, and commit P is the --theirs commit in slot 3. The --ours commit in slot 2 is from HEAD as usual.
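The same kind of sketch for revert, again in a throwaway repository with an invented file f.txt, on a linear history P, C, H. Compared with the cherry-pick case, slots 1 and 3 trade places:

```shell
# Throwaway repository in a temp dir; the identity below is a placeholder.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q

echo p > f.txt; git add f.txt; git commit -qm P
echo c > f.txt; git commit -qam C
c=$(git rev-parse HEAD)
echo h > f.txt; git commit -qam H

git revert --no-edit "$c" >/dev/null 2>&1 || true   # stops with a conflict in f.txt

slot1=$(git show :1:f.txt)   # merge base: C, the commit being reverted
slot2=$(git show :2:f.txt)   # ours: HEAD, i.e. H
slot3=$(git show :3:f.txt)   # theirs: P, the parent of C
echo "1=$slot1 2=$slot2 3=$slot3"
```

Here "them" is the parent of the reverted commit, which is why a file that C added shows up as "deleted by them".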

Note that if you use cherry-pick or revert on a range of commits:

git cherry-pick stop..start

the cherry-picking works one commit at a time using the topologically older commits first, while the reverting works one commit at a time using the topologically newer commits first. That is, given:

...--C--D--E--...
 \
  H   <-- HEAD

a git cherry-pick C..E copies D first, then E, but a git revert C..E reverts E first, then D. (Commit C does not come into play because the two-dot syntax excludes the commits reachable from the left side of the two-dot expression. See the gitrevisions documentation for more.)
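The ordering can be observed in a small sketch like the following. The branch names side and topic and the per-commit files are assumptions; each commit adds its own file so that nothing conflicts:

```shell
# Throwaway repository in a temp dir; the identity below is a placeholder.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q

echo r > r.txt; git add r.txt; git commit -qm R
git branch side
git checkout -q -b topic
for n in C D E; do echo "$n" > "$n.txt"; git add "$n.txt"; git commit -qm "$n"; done

# topic~2 is C and topic is E, so topic~2..topic means C..E.
git checkout -q side
git cherry-pick topic~2..topic >/dev/null 2>&1
picked=$(git log --format=%s -2 | paste -sd, -)     # newest first

git checkout -q topic
git revert --no-edit topic~2..topic >/dev/null 2>&1
reverted=$(git log --format=%s -2 | paste -sd, -)   # newest first

echo "cherry-pick: $picked"
echo "revert:      $reverted"
```

git log prints the newest commit first, so the copy of E sits on top after the cherry-pick (D was copied first), while the revert of D sits on top after the revert (E was reverted first).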

Rebase is repeated cherry-picking

The rebase command works by running git cherry-pick repeatedly, after using git checkout --detach or git switch --detach to go into detached HEAD mode. (Technically it now just does this internally; in the old days, some of the shell script based version of git rebase really did use git checkout, though with a hash ID which always went to detached mode anyway.)

When we run git rebase, we start with something like this:

       C--D--E   <-- ourbranch (HEAD)
      /
...--B--F--G--H   <-- theirbranch

We run:

git checkout ourbranch   # if needed - the above says we already did that
git rebase theirbranch   # or, git rebase --onto <target> <upstream>

The first—well, second—thing this does is enter detached HEAD mode, with the HEAD commit being the commit we selected with our --onto argument. If we did not use a separate --onto flag and argument, the --onto is from the one argument we did give, in this case, theirbranch. If we did not use a separate upstream argument, the one argument we gave—in this case theirbranch—is used for both purposes.

Git also (first, which is why the above is second) lists out the raw hash IDs of each commit that is to be copied. This list is much more complicated than it seems at first blush, but if we ignore the extra complications, it's basically the result of:

git rev-list --topo-order --reverse <hash-of-upstream>..HEAD

which in this case is the hash IDs of commits C, D, and E: the three commits that are reachable from ourbranch that are not also reachable from theirbranch.
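A sketch of that listing, assuming a throwaway repository shaped like the diagram above; the per-commit files are invented. The --format=%s option is added only so the output shows commit subjects rather than bare hash IDs:

```shell
# Throwaway repository in a temp dir; the identity below is a placeholder.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q

echo b > b.txt; git add b.txt; git commit -qm B
git branch theirbranch
git checkout -q -b ourbranch
for n in C D E; do echo "$n" > "$n.txt"; git add "$n.txt"; git commit -qm "$n"; done
git checkout -q theirbranch
for n in F G H; do echo "$n" > "$n.txt"; git add "$n.txt"; git commit -qm "$n"; done
git checkout -q ourbranch

# rev-list --format prints a "commit <hash>" line before each subject; drop those.
todo=$(git rev-list --topo-order --reverse --format=%s theirbranch..HEAD \
       | grep -v '^commit ' | paste -sd, -)
echo "to be copied, in order: $todo"
```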

With git rebase having generated this list and gone into detached-HEAD mode, what we have now looks like this:

       C--D--E   <-- ourbranch
      /
...--B--F--G--H   <-- theirbranch, HEAD

Now Git runs one git cherry-pick. Its argument is the hash ID of commit C, the first commit to be copied. If we look above at how cherry-pick works, we see that this is a merge-as-a-verb operation, with the merge base being the parent of C, i.e., commit B, the current or --ours commit being commit H, and the to-be-copied or --theirs commit being commit C. So that's why ours and theirs seem reversed.

Once this cherry-pick operation is complete, however, we now have:

       C--D--E   <-- ourbranch
      /
...--B--F--G--H   <-- theirbranch
               \
                C'  <-- HEAD

Git now proceeds to copy commit D with git cherry-pick. The merge base is now commit C, the --ours commit is commit C', and the --theirs commit is D. This means that both the ours and theirs commits are ours, but this time the "ours" commit is one we just built a few seconds (or milliseconds) ago!

It's based on existing commit H, which is theirs, but it's commit C', which is ours. If we get any merge conflicts, they're no doubt a result of being based on H, perhaps including some sort of conflict resolution we performed manually in order to make C'. But, quite literally, all three input commits are ours. Index slot #1 is from commit C, index slot #2 is from commit C', and index slot #3 is from commit D.

Once we have this all done, our picture is now:

       C--D--E   <-- ourbranch
      /
...--B--F--G--H   <-- theirbranch
               \
                C'-D'  <-- HEAD

Git now runs git cherry-pick on the hash of commit E. The merge base is commit D, and the ours and theirs commits are D' and E respectively. So once again, during rebase, all three commits are ours—though merge conflicts are probably a result of building on H.

When the last cherry-pick is done, Git finishes the rebase by yanking the name ourbranch off old commit E and pasting it on to new commit E':

       C--D--E   [abandoned]
      /
...--B--F--G--H   <-- theirbranch
               \
                C'-D'-E'  <-- ourbranch (HEAD)

We are now back in the normal attached-head mode of working, and because git log starts where we are now—at commit E'—and works backwards, which never visits original commit C, it seems as though we've somehow modified the original three commits. We have not: they are still there, in our repository, available through the special pseudo-ref ORIG_HEAD and available via our reflogs. We can get them back for at least 30 days by default, after which git gc will feel free to reap them and then they'll really be gone. (Well, as long as we didn't git push them to some other Git repository that's still keeping them.)
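To see that the original commits survive a rebase, a sketch like this one can be used. It assumes a throwaway repository shaped like the diagrams above, with invented per-commit files so that the rebase runs without conflicts:

```shell
# Throwaway repository in a temp dir; the identity below is a placeholder.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q

echo b > b.txt; git add b.txt; git commit -qm B
git branch theirbranch
git checkout -q -b ourbranch
for n in C D E; do echo "$n" > "$n.txt"; git add "$n.txt"; git commit -qm "$n"; done
git checkout -q theirbranch
for n in F G H; do echo "$n" > "$n.txt"; git add "$n.txt"; git commit -qm "$n"; done
git checkout -q ourbranch

old=$(git rev-parse ourbranch)           # original commit E
git rebase -q theirbranch >/dev/null 2>&1
new=$(git rev-parse ourbranch)           # copied commit E'

orig_head=$(git rev-parse ORIG_HEAD)     # where the branch pointed before the rebase
reflog_prev=$(git rev-parse 'ourbranch@{1}')
echo "old E: $old"
echo "new E': $new"
git log --format=%s -3 ORIG_HEAD         # the original E, D, C are still reachable
```

If the rebase went wrong, git reset --hard ORIG_HEAD (a destructive command, so use it with care) would move ourbranch back to the original E.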

answered Oct 21 '22 12:10 by torek


TL;DR

Jump to the very bottom for the results and conclusion.

Details:

Regarding:

Then git status shows the following conflicts:

deleted by them: path/to/file1.h
both modified:   path/to/file2.h
deleted by them: path/to/file1.cpp
deleted by them: path/to/test_file1.cpp
added by us:     path/to/file3.h
deleted by them: path/to/file4.h
added by us:     path/to/file5.h

I did some experimenting, and observed the following.

First, I manually resolved only the conflicts in the both modified file, path/to/file2.h, as normal for any rebase or merge conflict. I then added all files and finished the revert:

git add -A
git revert --continue

Next, I observed that all files marked with deleted by them, as well as all files marked with added by us, were still present on my file system. So, the revert deleted none of them. Next, I wanted to know: which commit created these files? To see this, run the following:

git log --diff-filter=A -- path/to/file

This shows the git log entry for just the one single commit which created this file. I did this one-at-a-time for each file which was deleted by them or added by us:

git log --diff-filter=A -- path/to/file1.h        # added by the commit I reverted
git log --diff-filter=A -- path/to/file1.cpp      # added by the commit I reverted
git log --diff-filter=A -- path/to/test_file1.cpp # added by the commit I reverted
git log --diff-filter=A -- path/to/file3.h        # added by a later commit
git log --diff-filter=A -- path/to/file4.h        # added by the commit I reverted
git log --diff-filter=A -- path/to/file5.h        # added by a later commit
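A small sketch of what --diff-filter=A selects, in a throwaway repository with an invented file.txt: only the commit that added the file is listed, not the one that later modified it.

```shell
# Throwaway repository in a temp dir; the identity below is a placeholder.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q

echo 1 > file.txt; git add file.txt; git commit -qm "create file.txt"
creator=$(git rev-parse HEAD)
echo 2 >> file.txt; git commit -qam "modify file.txt"

# --format=%H prints only the full hash of each matching commit.
found=$(git log --diff-filter=A --format=%H -- file.txt)
echo "created by: $found"
```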

I found that 4 of the files, as indicated above, were added by the commit I reverted. Note, this means they were added by the commit some_commit_hash itself, NOT by the revert commit which was created when I ran git revert some_commit_hash. So, why did they still exist if I reverted that commit? Well, it turns out, a later commit, which we will call later_commit_hash, which happened AFTER some_commit_hash, touched all 6 of those files, modifying 4 of them and creating 2 of them.

Let's group the above files by groups of deleted by them vs added by us:

# deleted by them:
path/to/file1.h
path/to/file1.cpp
path/to/test_file1.cpp
path/to/file4.h

# added by us:
path/to/file3.h
path/to/file5.h

Now indicate which file was added by which commit:

# deleted by them / added by the commit I reverted (`some_commit_hash`)
path/to/file1.h
path/to/file1.cpp
path/to/test_file1.cpp
path/to/file4.h

# added by us / added by a later commit (`later_commit_hash`)
path/to/file3.h
path/to/file5.h

So, you can see that the deleted by them files were added by the commit I reverted, which means that reverting that commit will delete those files! So, them refers to the undoing of the commit being reverted, some_commit_hash (that is, the side which wants those files gone), while us refers to the remaining commits at HEAD.

The conflict was that later_commit_hash touched those 4 "deleted by them" files, so the git revert some_commit_hash wasn't allowed to delete them. And, the 2 "added by us" files did NOT exist prior to some_commit_hash, so the conflict was that they shouldn't have existed after the revert, but they did, because they were created by later_commit_hash.
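The "deleted by them" case can be reproduced in isolation. This is a sketch under assumptions: a throwaway repository, with the commit subjects some_commit and later_commit standing in for the hashes above. It does not attempt to reproduce the "added by us" entries.

```shell
# Throwaway repository in a temp dir; the identity below is a placeholder.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q

echo r > readme; git add readme; git commit -qm "root"
echo v1 > file1.h; git add file1.h; git commit -qm "some_commit: add file1.h"
some=$(git rev-parse HEAD)
echo v2 >> file1.h; git commit -qam "later_commit: modify file1.h"

git revert --no-edit "$some" >/dev/null 2>&1 || true   # stops with a conflict

# In porcelain format, "UD" is the code for an unmerged path that is "deleted by them".
status=$(git status --porcelain -- file1.h)
echo "$status"
ls file1.h   # the version modified on our side is left in the work-tree
```

The revert wants the file gone (the parent of some_commit does not have it), but HEAD has modified it since, so Git leaves the decision to you.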

My solution was to manually delete all 6 of those files:

rm path/to/file1.h
rm path/to/file1.cpp
rm path/to/test_file1.cpp
rm path/to/file3.h
rm path/to/file4.h
rm path/to/file5.h

then I committed this change as a new commit:

git add -A
git commit

However, I could have instead reset back to the location prior to the revert commit and reverted later_commit_hash first, followed by reverting some_commit_hash second, effectively rolling these changes back in order, like this:

git reset --hard HEAD~  # WARNING! DESTRUCTIVE COMMAND! BE CAREFUL.
git revert later_commit_hash
git revert some_commit_hash
# should result in no conflicts during both of those reverts now

Results and Conclusions:

In either case, to answer my own question:

During git revert some_commit_hash:

  1. "us" = the currently-checked out commit (ie: HEAD) at the time you type and run git revert some_commit_hash, and:
  2. "them" = the inverse (opposite) of the commit you are reverting; i.e., an ephemeral commit which is the opposite of some_commit_hash, so that merging it in undoes some_commit_hash's changes.

Update 7 Jan. 2020: yes, this does indeed seem to be it. Here's my comment I just left underneath this other answer here. My comment seems to correlate with the above observation perfectly:

The key takeaway for me here regarding git revert is, I think, that if you have a linear tree ...A--B--C--D(HEAD), with D being your current HEAD, & you do a git revert B, then B, the very commit you are trying to revert, becomes the current merge-base, or Slot 1 in this "merge", and Slot 2, or "ours", becomes D/HEAD, and Slot 3, or "theirs", becomes A, or the parent of the commit being reverted, correct? Then, the low-level "merge" is carried out, resulting in applying all changes from B..D, as well as all changes from B..A, thereby reverting B, correct? This is hard.

So, that means this "ephemeral commit which is the opposite of some_commit_hash" is really just the inverse diff: a diff in the direction from the some_commit_hash you are reverting to its parent commit. Now, you have a low-level git merge going on under the hood, where the merge-base is the some_commit_hash to revert, "ours"/"us" is HEAD, and "theirs"/"them" is the parent of some_commit_hash, AKA some_commit_hash~. As git does this low-level merge, the diff from some_commit_hash to HEAD (i.e., the equivalent of git diff some_commit_hash..HEAD) captures all your new content, and the diff from some_commit_hash to its parent (i.e., the equivalent of git diff some_commit_hash..some_commit_hash~) captures the reverse of the changes done by commit some_commit_hash, thereby reverting this commit!
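A conflict-free sketch of that idea, assuming a throwaway repository with a linear history A, B, D; the files f.txt and g.txt and their contents are invented. The diff from B to its parent is B's change backwards, and the revert combines it with what HEAD did after B:

```shell
# Throwaway repository in a temp dir; the identity below is a placeholder.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q

echo old > f.txt; echo 1 > g.txt; git add f.txt g.txt; git commit -qm A
echo new > f.txt; git commit -qam B
b=$(git rev-parse HEAD)
echo 2 > g.txt; git commit -qam D

reverse=$(git diff "$b" "$b~")   # from B to its parent: removes "new", adds "old"
echo "$reverse"

git revert --no-edit "$b" >/dev/null 2>&1   # no conflict: B and D touch different files
f=$(cat f.txt)   # B's change undone ("theirs" side: B -> A)
g=$(cat g.txt)   # D's change kept   ("ours" side:   B -> D)
echo "f.txt=$f g.txt=$g"
```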

If I've got this all straight, it all makes perfect sense now!


I'm still struggling a bit with this concept but that's the gist of it. The exact mechanics of how revert works would really clarify things here I think. This answer may offer some more insight, but I don't understand it.

I've also just added an answer here to clarify "us" and "them" for all 4 git operations I can think of where this may happen: git merge, git cherry-pick, git rebase, and git revert: Who is "us" and who is "them" according to Git?


(Notes to self):

Need to take a look at: http://ezconflict.com/en/conflictsse12.html#x53-890001.7

answered Oct 21 '22 11:10 by Gabriel Staples