I am having issues with merge conflicts due to line endings while working with someone who uses a different OS. I work on Windows and my colleague is on a Mac. When he pushes his changes, files he hasn't worked on sometimes show up in the diff as changed, because the line endings now show ^M on each file. This has led to merge conflicts. I read the following in the Git docs:
Git can handle this by auto-converting CRLF line endings into LF when you add a file to the index, and vice versa when it checks out code onto your filesystem. You can turn on this functionality with the core.autocrlf setting. If you’re on a Windows machine, set it to true — this converts LF endings into CRLF when you check out code:
$ git config --global core.autocrlf true
If you’re on a Linux or macOS system that uses LF line endings, then you don’t want Git to automatically convert them when you check out files; however, if a file with CRLF endings accidentally gets introduced, then you may want Git to fix it. You can tell Git to convert CRLF to LF on commit but not the other way around by setting core.autocrlf to input:
$ git config --global core.autocrlf input
This setup should leave you with CRLF endings in Windows checkouts, but LF endings on macOS and Linux systems and in the repository.
This makes sense, but I am still unclear on how the files are actually committed in the repo. For example, if he creates a file on his system, it will have all LF line endings, correct? So when he commits, I presume those line endings are retained as-is. When I pull, my autocrlf being true will check them out with CRLF line endings, as far as I understand. (I get the warnings: warning: LF will be replaced by CRLF in <file x>; The file will have its original line endings in your working directory)
A couple of questions about this: when the warning says "working directory", what is that referring to? Also, when I then make changes or create other files, all of which have CRLF line endings, and commit+push, are they stored in the repo as CRLF or LF?
I imagine the ideal is to have the repo strip anything but LF every time a commit is made; is this what happens? What's going on under the hood and how can we force this to behave consistently?
Q1 Enforcing consistent line endings
Q2 Enforcing at commit as well as checkout (comment)
I'll divide this into 2 parts: Practice and Principle
Expansion of code-apprentice's suggestion
Avoid autocrlf. See "why autocrlf is always wrong", and the core git devs arguing about how ill thought out autocrlf is. Note particularly that the implementor is annoyed at the critic but doesn't deny the criticism.
Use .gitattributes instead.
Set safecrlf=true to enforce commit-cleanliness. safecrlf is the answer to your Q2: a file that would change on a check-in/check-out round trip errors out at the check-in stage itself.
When a new repo is init-ed:
Go through the output of ls -lR and, for each file, choose its type: text, binary, or ignore (i.e. put it in .gitignore).
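The practice above could be sketched roughly as follows. This is only an illustration: it works in a scratch repository so it can be run safely, and the file patterns (*.sh, *.bat, *.png) are assumptions standing in for whatever ls -lR turns up in your own project. The config key behind safecrlf is core.safecrlf.

```shell
# Scratch repository so the sketch does not touch real work
repo=$(mktemp -d) && cd "$repo" && git init -q

# Hypothetical classification of file types; adapt the patterns to your project
cat > .gitattributes <<'EOF'
# Let Git detect text files and store them with LF in the repository
* text=auto
# Text files that need a fixed ending in the working tree
*.sh  text eol=lf
*.bat text eol=crlf
# Binary files are never converted
*.png binary
EOF

# Refuse to add a file whose line endings would not survive a round trip
git config core.safecrlf true

git add .gitattributes
```

Because .gitattributes is committed with the project, every clone gets the same rules regardless of each person's core.autocrlf setting.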
Debugging:
Use git-check-attr to check that attribute matching and computation are as desired
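A minimal debugging sketch, assuming a reasonably recent Git (git ls-files --eol needs 2.8 or later). The scratch repository, the rule, and the file name notes.txt are made up for illustration.

```shell
# Scratch repository with one rule and one CRLF file
repo=$(mktemp -d) && cd "$repo" && git init -q
printf '*.txt text eol=lf\n' > .gitattributes
printf 'one\r\ntwo\r\n' > notes.txt
git add notes.txt

# Which line-ending attributes does Git compute for this path?
git check-attr text eol -- notes.txt

# What is in the index (i/...) versus the working tree (w/...)?
git ls-files --eol notes.txt
```

The first command should report the attributes that matched, and the second shows that the index copy and the working-tree copy can legitimately have different endings.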
We may treat git as a data-store loosely analogous to how a USB drive is one.
We say the drive is working if the stuff we put in comes out the same. Else it's corrupted. Likewise, if the file we commit comes out the same on checkout, the repo is fine; else (something) is borked. The key question is: what does "the same" mean?
It's non-trivial because we implicitly apply different standards of "sameness" in different contexts!
...are different
A text file consists of a sequence of lines, each a sequence of «printable characters»; let's leave the printable-char notion unspecified other than to say: no CR, no LF!
How these lines are separated (or terminated) is again unspecified
Symbolically:
type Line = [Char]
type File = [Line]
Expanding on the 1st unspecified gives us ASCII, Latins, Unicode etc etc... Not relevant to this question
Expanding on the 2nd is what distinguishes Windows, *nix, etc. JFTR, this kind of file may be little known by the younger generation but also exists. It is particularly useful to remember that the notion of a "sequence of lines" can be imposed at many different levels.
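To make the second unspecified part concrete, here is a small sketch (plain shell, no git involved) that writes the same two lines, "one" and "two", with LF and with CRLF terminators. The line contents are arbitrary examples.

```shell
# Same File = [Line], two different byte-level encodings of the line terminator
printf 'one\ntwo\n'     | od -c    # Unix style: each line ends in \n
printf 'one\r\ntwo\r\n' | od -c    # Windows style: each line ends in \r \n

# The byte counts differ (8 versus 10) even though the lines are identical
printf 'one\ntwo\n'     | wc -c
printf 'one\r\ntwo\r\n' | wc -c
```

At the level of lines the two files are the same; at the level of bytes they are not, which is exactly the gap that git has to be told how to bridge.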
We don't care how the sameness respects the unspecified parts
To return to our USB drive analogy:
When I copy foo.txt from Windows to Linux I expect the contents to be invariant. However, I'm quite satisfied if H:foo.txt changes to /media/name/Transcend/foo.txt. In fact it would be more than a bit annoying if the Windowsisms came through untranslated, or vice versa.
Far-fetched?? ¡¡Think again!!
IOW thanks to splendid folks like Theodore Ts'o we take it for granted that Linux can read a Windows file (system). This happens because a non-trivial amount of work happens under the hood.
We therefore expect that a file checked in to git is the same that's checked out... at a different time... And OS!
The catch is that the notion of same is sufficiently non-trivial that git needs some help from us in achieving that "sameness" to our satisfaction... That help is called .gitattributes!
autocrlf is widely considered to be broken. The modern way to handle line endings is with .gitattributes. GitHub has a great tutorial about how to use it.
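For a repository that already contains files committed with CRLF, one possible way to switch over is sketched below. It assumes Git 2.16 or later (for git add --renormalize); the scratch repository, the identity, and the file name legacy.txt are invented so the example can run on its own.

```shell
# Scratch repository with a hypothetical identity so commits work anywhere
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git config core.autocrlf false          # start with no conversion at all

# A file committed with CRLF endings, as might happen on Windows
printf 'one\r\ntwo\r\n' > legacy.txt
git add legacy.txt
git commit -q -m "Add file with CRLF endings"

# Introduce .gitattributes, then re-apply the rules to every tracked file
printf '* text=auto\n' > .gitattributes
git add .gitattributes
git add --renormalize .
git commit -q -m "Normalize line endings"

# Inspect the result: index (i/...) versus working tree (w/...)
git ls-files --eol legacy.txt
```

In a real project you would skip the scratch setup, run the .gitattributes and --renormalize steps on a clean working tree, and review git status before committing, since the normalizing commit can touch many files at once.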