
Why does git (sometimes) clone a commit when merging?

I'm aware that there are some scenarios that lead to Git commits being duplicated, for example a git cherry-pick. If a commit is cherry-picked and "merged" into another branch, it appears twice in the commit graph, with two commit hashes.

Is it possible that Git duplicates a commit during a git merge operation?

The reason I'm asking is the following commit graph (generated in TortoiseGit):

[Image: TortoiseGit commit graph showing the green, red, and black branches]

The commits are listed in 'date-order' here (author date).

After I committed efb916.. in the green branch (which was my master branch then), I merged the green branch into the red branch (which is a local side branch). This appears as a regular merge in the graph. So far so good.

After that, I clicked the Sync button in GitHub for Windows to sync my local master with the remote origin/master. This pulled the commit 09067c.. from the remote branch and then merged or rebased the local master, so commit 09067c.. was followed by commit efb916... However, instead of really merging efb916.. onto 09067c.., Git duplicated commit efb916.. and gave the duplicate the new hash 48e314...

In the end, my master pointed at 48e314.. (the black branch in the graph) and my local side branch pointed at 68b78a.. (the red branch). The contents, date, author and message of the commits efb916.. and 48e314.. are exactly the same.

This has happened several times, sometimes with just one commit being duplicated, sometimes with several commits.

Why did git duplicate the commit efb916..? How can I prevent it?

EDIT: As an additional note: I find it strange that my master originally pointed at efb916.., but after the GitHub Sync, efb916.. was no longer in the commit history of master.

asked Mar 19 '23 11:03 by cheesus


1 Answer

I'm going to break this down into a simple view first. It's not entirely accurate, but it's easier to grasp. Then I'll give you a resource for further reading.

The Simple View

The confusion comes from what a commit really is. The commit hash is not a hash of your patch; it's more of a hash of the system as a whole: the complete snapshot of the code tree, the parent commit(s), the author and committer details, and the message. So when you merge, Git is functionally applying all the patches in both (or more) branches that you're merging, coming up with a new complete code tree, and storing it in the object database under a hash that represents that resulting state.
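To make that concrete, here is a minimal sketch (the temporary repository, the "Example" identity, and the "snapshot" message are all made up for illustration). It uses the plumbing command git commit-tree to wrap the very same tree in two commits that differ only in their parent, and the two commits end up with different hashes:

```shell
#!/bin/sh
# Sketch only: a throwaway repo with a placeholder identity.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Two ordinary commits, so we have a base commit and a tip commit.
echo "a" > file
git add file
git commit -q -m "new file"
base=$(git rev-parse HEAD)

echo "b" >> file
git commit -q -a -m "added b"

# The tree (full snapshot) recorded by the tip commit.
tree=$(git rev-parse "HEAD^{tree}")

# Same tree, same message, different parent: two distinct commit objects.
with_parent=$(echo "snapshot" | git commit-tree "$tree" -p "$base")
without_parent=$(echo "snapshot" | git commit-tree "$tree")

# A commit object lists its tree, parent(s), author, committer and message.
git cat-file -p "$with_parent"
echo "with parent:    $with_parent"
echo "without parent: $without_parent"
```

The content is identical in both commits; only the parent line differs, and that alone is enough to change the hash.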

A rebase is similar: you're functionally replaying the patches from your branch on top of a different base, and the resulting commits will be different. They're different because what each patch is applied on top of matters and affects the hash, so the same patch on a new parent gets a new commit ID. You can run an easy test to show the difference:

Create a new repository with a single file:

# echo "a" >> file
# git init
Initialized empty Git repository in /home/hardaker/tmp/h/test/.git/
# (master #): git add file
# (master #): git commit -m "new file"
[master (root-commit) 9617f27] new file
 1 file changed, 1 insertion(+)
 create mode 100644 file

Now, let's branch it:

# (master): git checkout -b new-branch
Switched to a new branch 'new-branch'

Add a new file in the branch:

# (new-branch): echo "file2" > file2
# (new-branch): git add file2 
# (new-branch +): git commit -m "new file2" file2 
[new-branch 04f3fdf] new file2
 1 file changed, 1 insertion(+)
 create mode 100644 file2

Check out the master again and make a change there:

# (new-branch): git checkout master
Switched to branch 'master'
# (master): echo "b" >> file 
# (master *): git commit -m "added b" -a
[master 3b3de88] added b
 1 file changed, 1 insertion(+)

Now, let's display the log in tree form and see what we have so far:

# (master):  git log --oneline --graph --all --decorate
* 3b3de88 (HEAD, master) added b
| * 04f3fdf (new-branch) new file2
|/  
* 9617f27 new file

A nice tree view, with two branches. Note the commit ID of the new-branch, because it's about to change when we rebase new-branch onto master:

# (master): git checkout new-branch 
Switched to branch 'new-branch'
# (new-branch): git rebase master
 file | 1 +
 1 file changed, 1 insertion(+)
First, rewinding head to replay your work on top of it...
Applying: new file2

And the branch head no longer has the same commit ID (04f3fdf has become d9918d4), because the commit now sits on top of a different parent and you've changed how the system is constructed:

# (new-branch):  git log --oneline --graph --all --decorate
* d9918d4 (HEAD, new-branch) new file2
* 3b3de88 (master) added b
* 9617f27 new file
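If you want to check that a "duplicate" like this really is the same change under a new hash, one option is to compare patch IDs. The sketch below replays the walkthrough above in a throwaway repository (the identity is a placeholder, and the default branch name is read from Git rather than assumed to be master). It uses git patch-id, which hashes the diff itself rather than the commit object:

```shell
#!/bin/sh
# Sketch only: a throwaway repo with a placeholder identity.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "a" > file
git add file
git commit -q -m "new file"
main=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on setup

# Commit on the side branch, remembering its hash before the rebase.
git checkout -q -b new-branch
echo "file2" > file2
git add file2
git commit -q -m "new file2"
before=$(git rev-parse HEAD)

# Move the main branch forward so the two branches diverge.
git checkout -q "$main"
echo "b" >> file
git commit -q -a -m "added b"

# Rebase the side branch; its commit is recreated on the new base.
git checkout -q new-branch
git rebase -q "$main"
after=$(git rev-parse HEAD)

# The patch ID depends only on the diff, not on the parent or the dates.
patch_before=$(git show "$before" | git patch-id --stable | cut -d' ' -f1)
patch_after=$(git show "$after" | git patch-id --stable | cut -d' ' -f1)

echo "commit before rebase: $before (patch $patch_before)"
echo "commit after rebase:  $after (patch $patch_after)"
```

The two commit hashes differ while the two patch IDs match, which is the same situation as efb916.. and 48e314.. in the question: one change, recorded twice on different parents.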

Now, for good reading fun on what a commit really is, and what objects in the tree really are, read The Git Objects web page. You'll learn oh so much in just a few pages.

answered Mar 21 '23 06:03 by Wes Hardaker