In Gitk I can see a team member's commit (X) that has two parents: the first parent is his own previous commit (A), and the other parent contains lots of other people's commits (1 through 5). After his merge, all changes made by other people (1 through 5 and others) are no longer present at X, B, C, etc.
A------------
             \
              X - B - C
             /
1--2--3--4--5
         /
e--r--j--k
     /
l--m
If I diff commit X against commit A, it shows no differences; if I diff commit X against commit 5, it shows all the missing changes. Also, at commit X, B, or C, git log does not show the changes that were made to files in commits 1 through 5. However, git log --full-history does show the changes made in 1 through 5, even though those changes are no longer present in the actual files and the history does not show them being undone. So git log --full-history seems to contradict the current file contents.
I talked to the user who made commit X. He says he did not do a reset or rebase and he says he hasn't reverted any commits during the time in question. However, he says that he does sometimes do a pull origin master that results in everyone else's changes getting put in his index or working tree as if he had made those changes and not the actual authors of those changes. He says when that happens he does a fresh clone and does not push anything from that local repo to master because he believes Git has done something wrong.
Are the two things related (bad pull and bad merge)?
How can I tell exactly what happened so that we can avoid this in the future?
And what causes Git to sometimes place changes pulled from origin master in the local working directory or index as if they were local changes?
However, he says that he does sometimes do a pull origin master that results in everyone else's changes getting put in his index or working tree as if he had made those changes and not the actual authors of those changes.
It sounds like he was getting merge conflicts but does not understand what they are. This is an extremely common problem, and unfortunately, we don't know a good way to avoid it (switching back to SVN doesn't avoid it, for example).
Let's call your developers Alice and Bob. Alice made commits 1-5, and Bob made commits A and X. Here is a plausible history.
Bob makes commit A.
Alice makes commits 1-5, and pushes them to the central repository.
Bob tries to push A, but can't, because his repository is out of date.
$ git push
! [rejected] master -> master (non-fast-forward)
Bob then does what you told him to do: he pulls first. However, he gets a merge conflict because commit A and commits 1-5 touch some of the same code.
$ git pull
Auto-merging file.txt
CONFLICT (content): Merge conflict in file.txt
Automatic merge failed; fix conflicts and then commit the result.
Bob sees other people's changes in his working directory, and doesn't understand why the changes are there.
$ git status
both modified: file.txt
He thinks Git is doing something wrong, when in fact, Git is asking him to resolve a merge conflict. He tries to check out a fresh copy, but gets an error:
$ git checkout HEAD file.txt
error: path 'file.txt' is unmerged
Since it doesn't work, he tries -f:
$ git checkout -f HEAD file.txt
warning: path 'file.txt' is unmerged
Success! He commits and pushes.
$ git commit
$ git push
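The sequence above can be reproduced in a throwaway repository. The sketch below is only an illustration, not a record of what Bob actually ran: the branch name alice, the file contents, and the use of a local branch in place of a pull from origin are all assumptions, and the exact messages Git prints vary between versions.

```shell
#!/bin/sh
# Reproduce a merge commit that silently discards the other side.
# Everything here (names, contents, temp directory) is hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "bob@example.com"
git config user.name "Bob"

printf 'line\n' > file.txt
git add file.txt
git commit -q -m "base"
base_branch=$(git symbolic-ref --short HEAD)

# Alice's work (stands in for commits 1-5).
git checkout -q -b alice
printf 'alice\n' > file.txt
git commit -q -am "commit 5"

# Bob's commit A on the original branch.
git checkout -q "$base_branch"
printf 'bob\n' > file.txt
git commit -q -am "commit A"

# The merge conflicts, so git merge exits non-zero.
git merge alice >/dev/null 2>&1 || true

# Bob's "fix": force his own version back, then commit the merge.
git checkout -f HEAD -- file.txt 2>/dev/null
git commit -q -m "commit X"

# Compare X with each of its parents.
diff_vs_a=$(git diff --name-only HEAD^1 HEAD)
diff_vs_5=$(git diff --name-only HEAD^2 HEAD)
echo "changed vs A: ${diff_vs_a:-nothing}"
echo "changed vs 5: ${diff_vs_5:-nothing}"
```

The resulting commit X has two parents, is identical to A, and differs from 5, which matches the symptoms described in the question.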
There are a lot of git tools out there. Seriously. Visual Studio and Xcode both come with Git integration, there are several other GUIs, and there are even multiple command-line clients. People are also sloppy with the way they describe how they use Git, and most developers are not quite comfortable enough with how Git works outside of the "pull commit push" workflow.
There was an excellent paper on this very subject not too long ago (I'm having a hard time finding it). Some of the conclusions were (forgive my memory):
Most developers don't really know how to use source control, except for a few really simple commands (commit, push).
When source control doesn't behave the way developers expect, they resort to tactics such as copy-pasting some command they don't quite understand to "fix things", adding the -f flag, or erasing the repository and starting again with a clean copy.
On development teams, it is often the case that only the lead developers really know what is going on in the repo.
So this is really an educational challenge.
I think the key lesson here that Bob needs to learn is that git pull is really just git fetch and git merge, and that you can get merge conflicts, and you need to act in a very conscientious and purposeful manner when resolving merges. This applies even when there are no reported conflicts... but let's not blow Bob's mind too much for now!
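To make the fetch-then-merge point concrete, here is a minimal sketch that performs a pull as two explicit steps and backs out of a conflict with git merge --abort instead of forcing anything. The directory names upstream and bob, the file contents, and the variable names are assumptions made for the example.

```shell
#!/bin/sh
# "git pull" split into its two steps, with a safe way out of a conflict.
set -e
work=$(mktemp -d)
cd "$work"

# "upstream" stands in for the central repository.
git init -q upstream
cd upstream
git config user.email "alice@example.com"
git config user.name "Alice"
printf 'line\n' > file.txt
git add file.txt
git commit -q -m "base"
branch=$(git symbolic-ref --short HEAD)

# Bob clones, then Alice and Bob both change the same line.
git clone -q . ../bob
printf 'alice\n' > file.txt
git commit -q -am "Alice's change"

cd ../bob
git config user.email "bob@example.com"
git config user.name "Bob"
printf 'bob\n' > file.txt
git commit -q -am "Bob's change"
before=$(git rev-parse HEAD)

# Step 1 of a pull: download only; the working tree is untouched.
git fetch -q origin
incoming=$(git log --oneline "HEAD..origin/$branch")
echo "incoming commits: $incoming"

# Step 2 of a pull: merge. On conflict, list the paths and back out.
if ! git merge "origin/$branch" >/dev/null 2>&1; then
    conflicted=$(git diff --name-only --diff-filter=U)
    echo "conflicted paths: $conflicted"
    git merge --abort
fi
after=$(git rev-parse HEAD)
```

After the abort, Bob's branch and working tree are exactly as they were before the merge, so he can ask for help or retry and resolve the conflict by hand rather than reaching for -f.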
The other key lesson here is that lead developers need to take the time to ensure that everyone on the team can use source control correctly, and understands how pulling, pushing, branching, and merging are all related. This is a great opportunity for a lunchtime lecture: put together some slides, buy pizza, and talk about how Git works.
There are several ways to end up with other people's changes in the index. A pull is a fetch followed by a merge. That merge can result in a conflict, which would look as you described, with other people's changes in your index. A user who doesn't understand conflict management can cause a lot of damage, and the result could be the bad merge.
Otherwise, they'd have to pass extra flags to git pull like --no-commit to make it behave as they describe.
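To check whether a bad merge of this kind is already in the history, one option is to look for merge commits whose tree is identical to their first parent's tree, which is what "diff X to A shows no differences" means. The function name find_discarding_merges is made up for this sketch, and the output is only a list of candidates: a deliberate merge with the ours strategy looks the same.

```shell
#!/bin/sh
# List merge commits that kept nothing from their other parent(s).
# Usage (inside a repository): find_discarding_merges [revision]
find_discarding_merges() {
    git rev-list --merges "${1:-HEAD}" | while read -r m; do
        merge_tree=$(git rev-parse "$m^{tree}")
        first_parent_tree=$(git rev-parse "$m^1^{tree}")
        if [ "$merge_tree" = "$first_parent_tree" ]; then
            # Show who made the suspicious merge and when.
            git log -1 --format='%h %an %ad %s' "$m"
        fi
    done
}
```

Running find_discarding_merges master in the affected repository should print commit X if it was produced the way this thread suspects.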
Here's how I'd investigate...
Users are notorious for not reporting all the information. I'd find out exactly what they're doing when the problem happens and ask them to copy their terminal output when it does. Their shell history or reflog might be interesting, too.
Check their configuration. I would look at their ~/.gitconfig, project/.git/config and env | grep GIT to see if there's anything funny.
I'd also find out whether they're using git on the command line or through some tool; the tool could be causing the problem.
Find out what version of git they're using. It may be an old or buggy release (though I have yet to encounter a situation caused by a git bug).
Check their remotes. It's possible they've got some other repository mixed in somehow.
Does the repository have any hooks? If so, are they using utilities that might not be working as expected on the user's machine?
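Most of the checks above can be gathered in one pass. The function below is a rough sketch, not a standard tool: the name collect_git_report and the section labels are invented for this example, and it assumes a Git recent enough to support config --show-origin.

```shell
#!/bin/sh
# Print version, configuration, remotes, reflog and hooks for one repository.
# Usage: collect_git_report /path/to/repo
collect_git_report() (
    cd "$1" || exit 1
    echo "== version =="
    git --version
    echo "== config (with the file each setting comes from) =="
    git config --list --show-origin
    echo "== GIT_* environment variables =="
    env | grep '^GIT_' || echo "(none)"
    echo "== remotes =="
    git remote -v
    echo "== reflog (last 20 entries) =="
    git reflog -n 20
    echo "== hooks that are not samples =="
    hooks_dir=$(git rev-parse --git-path hooks)
    for h in "$hooks_dir"/*; do
        case "$h" in
            *.sample) ;;                          # shipped examples, inactive
            *) if [ -f "$h" ]; then echo "$h"; fi ;;
        esac
    done
)
```

Asking the user to run this on their own machine and send back the output avoids relying on their memory of what they did.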