Git: autocrlf=true core.eol=native vs autocrlf=input

Can somebody explain the difference between these settings:

core.autocrlf = true

core.eol = native

and

core.autocrlf = input

When should we use each of them?

Adrian Kalinowski asked Jan 22 '18 10:01


2 Answers

When [should] we use [either of these settings]?

My preference is never. On the other hand, I also don't work on Windows. :-) However, if you read through all of the text below, you will see that if I did, I'd still say "never". (Even if you are sharing some upstream repository in which you're not allowed to create a .gitattributes file, you can use the per-repository $GIT_DIR/info/attributes file in your own clone.)
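As a rough sketch of that last point: the per-repository attributes file takes the same syntax as .gitattributes but is never committed or pushed. The throwaway repository and the file names notes.txt and photo.jpg below are made up for illustration.

```shell
# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# $GIT_DIR/info/attributes is local to this clone only.
attr_file="$(git rev-parse --git-dir)/info/attributes"
mkdir -p "$(dirname "$attr_file")"
printf '*.txt text\n*.jpg -text\n' >> "$attr_file"

# Ask Git which "text" attribute it now sees for two sample paths
# (the paths do not need to exist for check-attr to answer).
git check-attr text -- notes.txt photo.jpg
```

git check-attr should report the text attribute as set for notes.txt and unset for photo.jpg, without any .gitattributes file in the work-tree.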

[What's the difference?]

To get to the difference, we first need to answer three questions:

  • What are conversions? What conversions can Git do?
  • When can Git do conversions? When will Git do conversions?
  • How does Git decide that a file is "text"?

Conversions, input and output: cleaning and smudging

The first part is pretty straightforward, although it presents its own stumbling block for newbies. Git can do any conversions you want, because it has what Git calls clean filters and smudge filters.

A clean filter is a conversion you—yes, you!—can write for yourself, that Git will apply when you copy a file from the work-tree to the index using git add or equivalent. That is, suppose you have a file checked out into your work-tree, and you edit it, or replace it entirely, or run some program over it that makes changes to it. You probably want to commit the new version of that file. So you must run git add path to copy the file from the work-tree, back into the index. (Remember that Git makes new commits from whatever's in the index, not from what's in the work-tree. This is why you keep having to git add your files all the time: Git doesn't automatically copy from work-tree to index.)

Whenever you run git add file, Git will "clean" the file on the way in. That's an input conversion.

Conversely, you can write your own smudge filter, which is something you do to file data when it comes out of Git (out of the index, into your work-tree). Since all files inside Git, including those in the index where they're ready to be copied into the next commit, are in some special, internal, Git-only format, every file must be converted into the normal format that all your regular computer programs can deal with. Whenever you extract the file to the work-tree, Git will "smudge" (dirty up) the file on the way out. That's an output conversion.

Git will occasionally do an input conversion without actually copying the file into the index: in particular, if you run a git diff that has to compare a work-tree file to an index or committed copy of the same file, the one that's inside the repository has already been "cleaned", while the one that's in the work-tree is all "smudged" and "dirty". They can't be compared until they are both in the same state, so Git will "clean up" the work-tree one before doing the diff.
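To make the clean/smudge round trip concrete, here is a minimal sketch with a deliberately silly filter. The filter name demo and the file note.txt are invented for this example; the filter upper-cases on the way in and lower-cases on the way out, purely so the two directions are easy to tell apart.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .

# "demo" is a made-up filter name: clean upper-cases, smudge lower-cases.
git config filter.demo.clean  'tr a-z A-Z'
git config filter.demo.smudge 'tr A-Z a-z'
printf '*.txt filter=demo\n' > .gitattributes

printf 'hello\n' > note.txt
git add note.txt              # clean runs here: work-tree -> index

# The copy stored in the index is the cleaned (upper-cased) form.
git cat-file -p :note.txt

# Extracting the file again runs the smudge filter: index -> work-tree.
rm note.txt
git checkout -- note.txt
cat note.txt
```

The index copy should read HELLO while the re-extracted work-tree copy reads hello. The built-in line-ending conversions described next behave like a pre-written pair of such filters.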

The built-in line-ending conversions

Git has two built-in conversions. One is meant to be used when cleaning, i.e., when files get copied from the work-tree into Git (into the index). This one replaces CRLF line endings with newline-only, Linux-style, line endings. The other is meant to be used when smudging, i.e., when copying files out of Git. This one replaces newline-only Linux-style line endings with ... well, something.

This "something" is where core.eol comes in. You can have Git replace newlines with CRLF, which you might want if you're on Windows and have programs that demand that lines end with CRLF, while you're also working with people on Linux, where programs expect lines to end with LF only.

Or, you can have Git replace newlines with LF-only ... except that's not a replacement, because a newline is a line-feed "LF" character. It's a bit silly to call this a replacement.

You can have Git choose the ending based on your system, so that one configuration, with core.eol set to native, works on both Linux and Windows.
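A small sketch of the two built-in conversions, assuming a throwaway repository and a made-up file a.txt that is explicitly marked as text. core.autocrlf is forced to false here only to keep it out of the picture; git ls-files --eol prints the line endings found in the index (i/...) and in the work-tree (w/...).

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config core.autocrlf false        # isolate the effect of core.eol
printf '*.txt text\n' > .gitattributes

printf 'one\r\ntwo\r\n' > a.txt       # CRLF endings in the work-tree
git add a.txt                         # cleaning: CRLF -> LF in the index

git config core.eol crlf              # smudging: LF -> CRLF
rm a.txt && git checkout -- a.txt
eol_crlf=$(git ls-files --eol a.txt)

git config core.eol lf                # "replace" LF with LF, i.e. do nothing
rm a.txt && git checkout -- a.txt
eol_lf=$(git ls-files --eol a.txt)

printf '%s\n%s\n' "$eol_crlf" "$eol_lf"
```

Both lines should show i/lf for the index; the first should show w/crlf and the second w/lf for the work-tree.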

Git is a little sneaky: when it is going to "replace" LF with LF (which isn't a replacement after all) it tends to do nothing—not even to inspect anything—and hence go faster. It seems like you might never notice this, except that the core.safecrlf setting requires that Git inspect things. This safecrlf thing involves some guessing, and is meant to be overprotective and get you to set .gitattributes settings if you're doing conversions at all, so that you don't damage any binary files.

Binary files: how does Git decide that a file is text?

Some files, like .jpg images for instance, are simply not text and should never have any of their data modified in "text-ish" ways. They need to be manipulated with image-manipulation code, not with a text editor, or a clumsy tool like Git's built-in conversions. Git therefore needs a way to distinguish text files, which should get these line-ending conversions applied, from non-text or binary files.

If you don't tell Git which files are which, it is going to have to guess. The method Git uses to guess is not to look at the .jpg or .txt extension—this doesn't work on a file named README, for instance. Instead, Git looks at the data stored in the file, and guesses based on whether it "looks text-ish" or "looks binary".
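You can watch the guess being made with git ls-files --eol. In this sketch the file names README and blob are made up, neither has an extension, and no attributes are set, so whatever Git reports comes from the content alone. The NUL byte in blob is one thing that makes content "look binary".

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .

printf 'plain words\n' > README       # looks text-ish
printf 'abc\000def\n'  > blob         # contains a NUL byte

git add README blob
git ls-files --eol
```

README should be reported as i/lf and blob as i/-text, Git's way of saying "not text".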

As you can imagine, this guessing game is not perfect. It may work for you, but if it doesn't, you can and should tell Git which files are which. You do this by creating a file named .gitattributes. In .gitattributes, you can list particular file names like README, or path name patterns like *.txt and *.jpg, as being "definitely text" or "definitely binary". You do this with text or -text. You can also tell Git: guess! You do this with text=auto:

*.txt   text
*.jpg   -text
guess   text=auto

You should never use auto if you can help it.

You don't have to do this if you never have Git do any conversions. The point of telling Git which files are text and which are binary is to make sure that Git does the conversions correctly, and you only need to do this if you are doing conversions. So if you avoid Windows, you don't have to create a .gitattributes and list your files. It doesn't really hurt to create it anyway, but if you do create it, you should try to have it cover all your files, so that Git does not have to guess.
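One way to check whether a .gitattributes really covers everything is to ask Git for the text attribute of every tracked path and look for "unspecified". This is only a sketch; the repository and the file names are invented.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .

printf '*.txt text\n*.jpg -text\n' > .gitattributes
touch a.txt b.jpg README
git add .

# Feed every tracked path to check-attr; keep the ones Git would have to guess.
uncovered=$(git ls-files | git check-attr --stdin text | grep ': unspecified$' || true)
printf '%s\n' "$uncovered"
```

Here README (and .gitattributes itself) should show up as unspecified, which is the cue to add a line for them.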

Now that we know all this, we can understand the documentation

With the above in mind, we can see what core.autocrlf does by consulting the git config documentation and scrolling down to the core.autocrlf description:

Setting this variable to "true" is the same as setting the text attribute to "auto" on all files and core.eol to "crlf". Set to true if you want to have CRLF line endings in your working directory and the repository has LF line endings. This variable can be set to input, in which case no output conversion is performed.

In other words, core.autocrlf=true is like using the auto setting on all files, which is something you should never do. So you should never use this. :-) It can work, but I would not recommend it: create a proper .gitattributes and list all your files there, so that you are not playing guessing games. Once your .gitattributes file lists everything, core.autocrlf=true has no effect, because the .gitattributes setting overrides it.

Using core.autocrlf=input tells Git to do the same guessing, but also to do only input conversions (during the git add cleaning for instance). I have no use for this setting myself, and cannot really imagine any situation in which it's a good idea. Such a situation might exist, but if you're going to do conversions at all, you should specify them explicitly; and once you have specified them correctly in your .gitattributes file, it seems to make more sense to do conversions in both directions, so there's no reason to use input.
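The practical difference between the two values can be seen in a throwaway repository. This is a sketch with a made-up file a.txt and no .gitattributes, so Git is guessing that the file is text in both cases.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
printf 'one\r\ntwo\r\n' > a.txt           # CRLF endings in the work-tree

git config core.autocrlf input
git add a.txt                             # input conversion: CRLF -> LF
rm a.txt && git checkout -- a.txt         # no output conversion
with_input=$(git ls-files --eol a.txt)

git config core.autocrlf true
rm a.txt && git checkout -- a.txt         # output conversion: LF -> CRLF
with_true=$(git ls-files --eol a.txt)

printf '%s\n%s\n' "$with_input" "$with_true"
```

In both cases the index should hold LF (i/lf). With input the re-extracted work-tree file stays LF (w/lf); with true it comes back as CRLF (w/crlf).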

As for setting core.eol to native, the documentation claims that this is the default (and it seems like the best choice), so there is no reason to bother, other than to override some other configuration file's poor choice of a non-default setting.
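If you suspect some configuration file has set core.eol, git config can show where the value comes from. A sketch in a throwaway repository, where the override is planted on purpose:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Simulate a configuration file that overrides the default.
git config core.eol crlf

# --show-origin reports which file each value comes from.
git config --show-origin --get-all core.eol

# Drop the override so the default applies again in this repository.
git config --unset core.eol
git config --local --get core.eol || echo "core.eol is not set in this repository"
```

In a real clone the same --show-origin query also lists values from the global and system files, if any of them set core.eol.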

torek answered Oct 06 '22 01:10


Both the GitHub documentation and the Git book (https://git-scm.com/book) recommend these settings:

  • Mac/Linux: git config --global core.autocrlf input
  • Windows: git config --global core.autocrlf true

See:

  • Github recommendation
  • Git SCM book, paragraph "Formatting and Whitespace"

I have been working with these settings for 10+ years in mixed environments and never had a problem.

Ernie answered Oct 06 '22 02:10