Git Merge: parallel identical additions are merged without conflict by including the same code block TWICE in the merged file without complaint. Why?

Tags: git, merge

I have a slightly tricky history in my Git log that I am trying to understand fully.

Before explaining the sequence of commits, let me paste images of the Git log (using SmartGit to visualize the history) for a file in question:

git_history_later

...

git_history_earlier

Shown is the Git history relevant to my question, with the irrelevant middle section snipped away.

Two developers made code changes to the file in question, Developer 1 and Developer 2.

Developer 1

  • checks out the branch labeled 'origin/staging' in the picture (but which was also 'origin/development' at the time it was checked out)
  • adds a small code block (see below)
  • commits and pushes - this push is labeled 'jon-dev-test-merge', above
  • makes a simple whitespace change to the code from the previous commit, and commits/pushes this whitespace change, labeled 'jon-dev-test-merge-2', above
  • realizes he should have been working on a separate branch, so starts over and checks out 'origin/staging' (at that time, 'origin/development') into a new branch called 'jons_dev', which he pushes to origin with tracking set up
  • adds the same small code block (NOT including the whitespace change) to this new branch (blue line) (into a commit sometime before the one labeled 'test-merge-sales')
  • later, seeing that Developer 2 has made a change on the 'origin/development' branch, merges 'origin/development' into his 'jons_dev' branch

Meanwhile...

Developer 2

  • checks out the branch labeled 'origin/staging' in the picture (but which was also 'origin/development' at the time it was checked out)
  • makes a code change to another file (completely unrelated to the code change from Developer 1, above) - this is in his working copy on his local development machine
  • pulls Developer 1's code changes to the 'origin/development' branch to his local machine and merges into a local-only branch/working copy; the merge succeeds without conflict. Note that we don't see Developer 2's merge into his local branch (because he did not push that branch to the origin), but only the merge (see next bullet point) from his local branch back into what was the 'origin/development' branch at that time.
  • merges into the tracked development branch from origin on his local machine, and pushes the merge back up to origin - labeled 'vladimir-test-merge', above

So far, so good.

Here is my question.

In the process of understanding the sequence of events above, I noticed something odd: for the file in question, no changes from the 'origin/development' branch needed to be incorporated into the merge labeled 'origin/development'. The explanation turned out to be that identical changes (not including the whitespace change) had been made to this file in parallel and were therefore present on both sides, so only the changes from the 'jons_dev' branch were required (and this is how the merge was performed).

However, I noticed something from the merge, involving Git's method for merging and determining conflicts, that I cannot explain.

To demonstrate the issue in the simplest way for my question, I first created the test branches indicated in the screenshots - 'test-merge-sales', and 'jon-dev-test-merge' / 'jon-dev-test-merge-2'. I then checked out the branch 'test-merge-sales' and performed two separate merges into this branch (cancelling the merge in between the two tests).

The relevant results from these two merges are shown below. (Addendum: due to comments below the question, the second merge scenario is easily explained. However, the FIRST merge scenario is still a question.)


(0) BASE file

Before showing a 3-way screenshot from the two merges, here is a screenshot of the relevant section of the file as it existed in the 'origin/staging' branch BEFORE the branches diverged:

The base file: pre-branch-file

The image shows how the file looked for BOTH developers, BEFORE any changes were made by either developer.

There are also comments present, which explain exactly what code changes were made by each developer to arrive at the pre-merge state shown in Case (1), below.

As you can see from the comments, Developer 1 adds a function - print_customer_part_order_history, followed by a second function, print_sales_analysis_page, to a given spot in the file. In parallel, the other branch has added to it a single function - print_customer_part_order_history - in exactly the same place in the file. The code is identical, including whitespace.

This is the state of the file moving into Merge Case (1), below.


(1) Merge from 'jon-dev-test-merge' into the 'test-merge-sales' branch

Note: This merge scenario is my main question. Due to comments below the question, the question associated with the other merge scenario (#2, below) is already answered.

This merge did not result in a conflict. Opening a diff viewer for the merged file, here is a screenshot from the relevant (merged) lines:

merge_from_jon-dev-test-merge

Note that Git merged the files by including BOTH identical functions (added in parallel) 'print_customer_part_order_history()' - with no merge conflict. (This is the code snippet that Developer 1 added, in parallel, to the two branches.) Therefore, this function appears twice in the merged code.

Note: the highlighted code block has the same whitespace (leading spaces) in both branches being merged, 'test-merge-sales' and 'jon-dev-test-merge'.

Question 1: Why did Git decide there was no merge conflict? Two blocks of code were added in parallel at the same location in the file. Even though the blocks of code are identical, I would think that this should be a merge conflict.


(2) Merge from 'jon-dev-test-merge-2' into the 'test-merge-sales' branch

Note: Due to comments below this question, the question associated with this merge scenario is already answered.

merge_from_jon-dev-test-merge-2.jpg

The only difference in the code being merged is that Developer 1 changed leading spaces to tabs. However, in this case, with only the whitespace difference, Git has declared there is a merge conflict.

Question 2: Why - with only a difference in whitespace - would Git decide that in one case, there is no merge conflict, and in the other case, that there is a merge conflict?


My two questions are identified above, regarding how Git handles merges and merge conflicts.

Thanks!


ADDENDUM I have added an additional screenshot - the relevant text of the file BEFORE the two branches diverged - along with descriptive text, in the section called "(0) Base File". Thanks!

Dan Nissenbaum asked Feb 26 '14 22:02


1 Answer

It's pretty simple: when doing a merge, Git analyzes the lines that have changed on both sides:

  • If the changes are less than 2 lines apart (reference coming soon), this creates a conflict, since the changes are very probably about the same thing;
  • If the changes are more than 2 lines apart, the content from each side is considered to be two different things, and the added content is not analyzed, just added to the resulting file.

Since LHS added the function on line 79 and RHS added it on line 69, Git treated the two additions as different content, because they were more than a few lines apart.
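To make the reasoning above concrete, here is a minimal Python sketch of a line-based three-way merge. It is a deliberately simplified model, not Git's actual algorithm: the helper names `hunks` and `merge`, the rule that touching regions conflict, and the single-letter sample lines are all assumptions made for illustration. It shows that the same block added at two separate spots is applied twice, that the same block added at the same spot is kept once, and that a whitespace-only difference at the same spot is reported as a conflict.

```python
from difflib import SequenceMatcher


def hunks(base, side):
    """Return (start, end, new_lines) for each region of base that side changed."""
    sm = SequenceMatcher(None, base, side, autojunk=False)
    return [(i1, i2, side[j1:j2])
            for tag, i1, i2, j1, j2 in sm.get_opcodes() if tag != "equal"]


def merge(base, ours, theirs):
    """Toy three-way merge: return the merged lines, or None on conflict."""
    a = hunks(base, ours)
    # An identical change made on both sides is kept only once.
    b = [h for h in hunks(base, theirs) if h not in a]
    # Assumption of this sketch: changed regions that overlap or touch
    # in the base file are treated as a conflict.
    for s1, e1, _ in a:
        for s2, e2, _ in b:
            if s1 <= e2 and s2 <= e1:
                return None
    merged = list(base)
    # Apply from the bottom of the file upward so earlier indices stay valid.
    for start, end, new in sorted(a + b, key=lambda h: h[0], reverse=True):
        merged[start:end] = new
    return merged


base = ["a", "b", "c", "d", "e", "f"]
block = ["X", "Y"]                               # stands in for the added function
early = base[:2] + block + base[2:]              # block added after "b"
late = base[:5] + block + base[5:]               # same block added after "e"
tabbed = base[:2] + ["\tX", "\tY"] + base[2:]    # same spot, different whitespace

print(merge(base, early, late))    # block appears twice, no conflict
print(merge(base, early, early))   # identical change at the same spot, kept once
print(merge(base, early, tabbed))  # None: differing changes at the same spot
```

In this model the content of the two additions is compared only when they land on the same region of the base file, which is why an identical block added elsewhere is not detected as a duplicate.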


How to avoid this in the future?

  • Diff the two branches before merging; the duplication is visible if you read the diff carefully, and you can then fix it while merging;
  • Communicate more within your team (if possible): is it normal for two developers to write the exact same function in two separate branches?
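As a complement to reading the diff by hand, a small post-merge check can flag functions that ended up defined more than once. The sketch below is only an illustration: it assumes definitions look like `function name(` at the start of a line (the question does not show the file's actual syntax), and the helper name `duplicated_functions` and the empty function bodies are hypothetical.

```python
import re
from collections import Counter

# Assumed definition syntax: "function <name>(" at the start of a line.
FUNC_DEF = re.compile(r"^\s*function\s+(\w+)\s*\(", re.MULTILINE)


def duplicated_functions(source):
    """Return the sorted names of functions defined more than once in source."""
    counts = Counter(FUNC_DEF.findall(source))
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical merged file with one function included twice.
merged = """\
function print_customer_part_order_history() {
}

function print_sales_analysis_page() {
}

function print_customer_part_order_history() {
}
"""

print(duplicated_functions(merged))  # ['print_customer_part_order_history']
```

Running such a check on the merged file before committing would surface the duplicated `print_customer_part_order_history` even though the merge itself reported no conflict.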

From the git merge documentation (emphasis is mine):

HOW CONFLICTS ARE PRESENTED

During a merge, the working tree files are updated to reflect the result of the merge. Among the changes made to the common ancestor’s version, non-overlapping ones (that is, you changed an area of the file while the other side left that area intact, or vice versa) are incorporated in the final result verbatim. When both sides made changes to the same area, however, Git cannot randomly pick one side over the other, and asks you to resolve it by leaving what both sides did to that area.

Thomas Ayoub answered Nov 01 '22 21:11