
How to force converting worktree files after changing core.autocrlf?

I'm on Windows and have systemwide core.autocrlf=true.

For a specific repository, I've overridden it locally to false.

But that didn't convert line endings in checked-out files. How do I do that?

  • If I convert the files manually with e.g. dos2unix, they show as altered.
  • I also tried git checkout --force HEAD, but it had no effect.

The only way I have found that works is to delete all the files and then run git reset --hard, which is rather awkward: there's no simple and reliable command to do that, and it does lots of unnecessary work, since everything is recreated from scratch rather than just overwriting the files that need to be converted.

ivan_pozdeev asked Dec 02 '18 14:12




1 Answer

TL;DR

These are three possible solutions (not necessarily the only three).

  1. Use:

    git add --renormalize .
    

(done in the top level of the repository, once). This requires Git 2.16 or later, but is the simplest method.

Note: it's not at all clear to me whether this affects the work-tree versions; you might still need git checkout -- . to re-copy from index to work-tree.

  2. For each file that git status is complaining about: rm file; git checkout -- file. The rm removes the work-tree copy so that git checkout must actually re-extract the file according to the new line-ending rules.

You can simplify this somewhat with git rm -r .; git checkout HEAD -- . (just two commands) but this has the side effect of touching all the files in the work-tree, even any files with no changes needed (files that have no carriage-returns in them).

  3. Use dos2unix as you have been, then run git add on the files (or on .). Despite appearances, this should leave the index unchanged.

In all cases, afterward, git status should say nothing to commit, working tree clean.
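
As a rough sketch of the third option, the script below rebuilds the situation from the question in a throwaway repository and then converts and re-adds the file. The temporary directory, the README.txt contents, and the identity settings are all assumptions made for illustration, and tr -d '\r' stands in for dos2unix so that the script has no extra dependencies.

```shell
#!/bin/sh
set -e

# Throwaway repository; every path and name here is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Recreate the starting point: autocrlf=true with CRLF in the work-tree.
git config core.autocrlf true
printf 'line one\r\nline two\r\n' > README.txt
git add README.txt 2>/dev/null   # stored with LF-only endings (warning hidden)
git commit -q -m "initial"

# Override the setting locally, as in the question.
git config core.autocrlf false

# Option 3: convert the work-tree copy by hand, then git add it so that
# Git refreshes what it has cached about the file.
tr -d '\r' < README.txt > README.tmp && mv README.tmp README.txt
git add .

cr_count=$(tr -cd '\r' < README.txt | wc -c | tr -d ' ')
status_out=$(git status --porcelain)
echo "carriage returns left in work-tree copy: $cr_count"
echo "git status --porcelain output: '$status_out'"
```

If everything went as described, the work-tree copy has no carriage returns left and git status --porcelain prints nothing.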

Long

This is not quite a duplicate of Git: how to renormalize line endings in all files in all revisions?, as you don't want to re-copy a bunch of existing commits. However, the git add --renormalize answer there should work.

Or, if that fails or if your Git is too old to have the --renormalize option:

"If I convert the files manually with e.g. dos2unix, they show as altered."

You can convert the files manually, then git add ., or remove the work-tree copies and git checkout them again. The git checkout --force HEAD failed because Git was too smart for its own good: it saw (incorrectly) that the work-tree copy was already correct and avoided doing work on it.

What's going on here

There are, at all times, three active copies of each file. Let's say you have a README.txt and a prog.cc, both of which have CRLF endings in your work-tree, but LF-only line endings in the repository.

   HEAD          index       work-tree
----------    ----------    ----------
README.txt    README.txt    README.txt
prog.cc       prog.cc       prog.cc

The copy in the commit is sacrosanct, inviolable, frozen forever (or as long as that commit exists) in whatever form it has there. (I'm assuming for now that each of these files has LF-style line endings.) It's compressed, too.

The copy in the index is writable, but initially matches the copy in the commit, so it also has LF-only line endings. It's compressed, too (it's actually just a reference to the committed copy, at first).

The copy in the work-tree is uncompressed and has the line endings you told Git to use through your .gitattributes file (none) and your core.autocrlf and core.eol and so on. You had them set to change LF to CRLF, so the copies in your work-tree have CRLF endings at the moment.
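
One way to look at the three copies yourself is sketched below, assuming a scratch repository with a hypothetical two-line README.txt. It counts carriage returns in the HEAD, index, and work-tree copies, and then asks git ls-files --eol for its own summary of the index (i/) and work-tree (w/) line endings.

```shell
#!/bin/sh
set -e

# Scratch repository; the file name and contents are only for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git config core.autocrlf true

printf 'hello\r\nworld\r\n' > README.txt
git add README.txt 2>/dev/null   # CRLF is turned into LF on the way in
git commit -q -m "initial"

# Count carriage returns in each copy. git cat-file -p prints the stored
# blob without applying any line-ending conversion.
head_cr=$(git cat-file -p HEAD:README.txt | tr -cd '\r' | wc -c | tr -d ' ')
index_cr=$(git cat-file -p :README.txt | tr -cd '\r' | wc -c | tr -d ' ')
work_cr=$(tr -cd '\r' < README.txt | wc -c | tr -d ' ')
echo "CR count  HEAD: $head_cr  index: $index_cr  work-tree: $work_cr"

# Git's own summary of index and work-tree line endings for the file.
eol_info=$(git ls-files --eol README.txt)
echo "$eol_info"
```

The HEAD and index copies should report no carriage returns while the work-tree copy reports two, and the ls-files line should show i/lf next to w/crlf.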

Now—after the checkout—you change your settings, so that files that get checked-out will have LF-only line endings, or will preserve what's in the index. Unfortunately, one of the entries in each index copy of the file is information about the work-tree copy. This makes Git assume that the work-tree copy is the same as the index copy.

Clearly, since the work-tree copy has CRLF endings while the index copy has LF-only endings, the two differ byte for byte. But as long as your end-of-line settings are unchanged, git status must report them as matching, so it has to make this assumption.

If you hadn't changed the EOL settings, git status would say nothing and this would bother no one, because if you ran git add on, say, README.txt, that would copy the work-tree copy back into the index. Along the way this would turn CRLF line endings into LF-only line endings, and re-compress the file. The resulting file would match the HEAD copy, and git status would have to say nothing.

But you did change the EOL settings, so if you ran git add now, Git should copy the CRLF ending into the index. Essentially, git status has been fooled: the index says—on purpose!—that the work-tree copy matches (even though it doesn't), and running git add while the work-tree copy has CRLF line endings would change the index copy.

If you use dos2unix on the file to change the work-tree copy, Git now sees that the work-tree copy's statistics don't match the index's saved "this file is clean" statistics. That is, git status remains fooled but now says that the work-tree copy is different! If you git add the file now, Git will keep the LF-only line endings while updating the index copy. The end result will be that the index copy matches the HEAD copy after all, and that Git updates the cached work-tree statistics about the file so that it knows that the index copy matches the work-tree copy.

Essentially, after changing line-ending settings (in .gitattributes and/or core.* variables), you must have Git fix the index's "clean/dirty" cache data. Before git add --renormalize existed, the only way to do that was to force Git to copy from index to work-tree:

    rm worktreefile
    git checkout -- worktreefile

or force Git to copy from work-tree to index:

    git add worktreefile

both of which fix up the index's cache data, but obviously do a bit of additional violence in the process.
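
To limit that violence to the files that actually need it, the index-to-work-tree route can be driven by git ls-files --eol, as in the sketch below. It is only an illustration under some assumptions: a scratch repository, hypothetical README.txt and prog.cc files, and simple path names (Git quotes unusual paths in this output, which the loop does not handle).

```shell
#!/bin/sh
set -e

# Scratch repository with one CRLF file and one file that is already LF-only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git config core.autocrlf true

printf 'hello\r\nworld\r\n' > README.txt
printf 'int main() { return 0; }\n' > prog.cc
git add . 2>/dev/null
git commit -q -m "initial"
git config core.autocrlf false

# Each ls-files --eol line ends with a TAB followed by the path.
tab=$(printf '\t')
git ls-files --eol | while IFS= read -r line; do
    case "$line" in
        *w/crlf*)
            path=${line#*"$tab"}
            rm -- "$path"              # force Git to re-extract the file
            git checkout -- "$path"    # written out under the new settings
            echo "re-extracted: $path"
            ;;
    esac
done

work_cr=$(cat README.txt prog.cc | tr -cd '\r' | wc -c | tr -d ' ')
status_out=$(git status --porcelain)
echo "carriage returns left: $work_cr"
```

Only README.txt is removed and checked out again here; prog.cc already has LF-only endings in the work-tree, so the loop skips it.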

Note that if the committed HEAD copy has CRLF endings, things change

Suppose that the committed copy of README.txt has CRLF endings. Then, initially:

  • the index copy matches the HEAD copy as usual, so it has CRLF endings;
  • with CRLF endings in the work-tree, all three copies match;
  • but if you select LF-only endings in the work-tree, and make that happen, the work-tree copy differs from both HEAD and index.

This is true regardless of whether git status is fooled.

Once you copy the work-tree's LF-only line endings into the index such that the index also has LF-only line endings, now the index copy ("staged for commit") differs from the HEAD copy. At this point, if you make a new commit, that commit will have LF-only line endings, and you'll be in the state we described earlier.
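
The CRLF-in-HEAD case can be sketched the same way. This assumes a scratch repository in which a hypothetical README.txt was committed with its CRLF endings intact (core.autocrlf=false from the start), and again uses tr -d '\r' as a stand-in for dos2unix.

```shell
#!/bin/sh
set -e

# Scratch repository where the committed copy itself has CRLF endings.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git config core.autocrlf false

printf 'hello\r\nworld\r\n' > README.txt
git add README.txt
git commit -q -m "CRLF committed as-is"
head_cr=$(git cat-file -p HEAD:README.txt | tr -cd '\r' | wc -c | tr -d ' ')

# Switch the work-tree copy to LF-only endings and stage it.
tr -d '\r' < README.txt > README.tmp && mv README.tmp README.txt
git add README.txt

# The index copy now differs from HEAD, so the file is staged for commit.
staged=$(git diff --cached --name-only)
echo "CRs in HEAD copy: $head_cr"
echo "staged: $staged"
```

Here git diff --cached lists README.txt, because the staged LF-only copy no longer matches the CRLF copy in HEAD.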

torek answered Oct 17 '22 06:10
