I'm trying to remove all CRLF line endings from a project's git repo. I'm writing a command to grep through the repo recursively to find instances. However, when opened in vim, some of the 'hits' explicitly show ^M, while others do not display these characters.
However, when running
file <filename without visual ^M>
It says
blah.java ASCII Java program text, with CRLF line terminators
and
od -cx <filename without visual ^M>
returns with \r\n peppered throughout.
I'm just interested in why vim sometimes shows them and sometimes doesn't.
EDIT:
I created a test text file and manually added ^M (i.e. Ctrl-V followed by Ctrl-M), and vim displayed those characters. Then I ran:
sed -i '' -e 's/\r//g' controlm.txt
When I opened the file in vim, the visual ^M were gone, but od -cx still showed \r \n. However, I then ran
sed -i '' -e 's/^M//g' controlm.txt
That removed the visual ^M in vim, and I've confirmed with od -cx that the \r \n sequences are now just \n.
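As an aside on the two sed attempts above: the `-i ''` form suggests BSD sed, which may not treat `\r` inside a pattern as a carriage return (this is an assumption about the environment, not something the post confirms). A minimal sketch that sidesteps the question uses tr, which interprets `\r` itself. The scratch directory and file names are hypothetical, and note that this deletes every carriage return in the file, not only those directly before a newline.

```shell
# Hypothetical scratch directory and file names, for illustration only.
dir=$(mktemp -d)

# Build a small sample file with CRLF line endings.
printf 'one\r\ntwo\r\n' > "$dir/crlf.txt"

# Delete all carriage returns; tr understands the \r escape on its own.
tr -d '\r' < "$dir/crlf.txt" > "$dir/lf.txt"

# Dump the bytes of the result to check that only \n remains.
od -c "$dir/lf.txt"
```

Writing to a second file rather than editing in place keeps the original around for comparison with od.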
This question would probably better be asked on Superuser.com, not here, because it's about using vim, not programming. But to answer it:
When opening a file, vim tries to detect whether it's an MS-DOS/Windows file or a unix file. If all lines are terminated by \r\n, it's probably a DOS file; if only some of them are, vim may assume unix instead. If the file format is set to DOS, vim strips the \r from each line ending when reading the file, and shows [dos] in the status line directly after reading it. When writing the file back, it terminates each line with \r\n; if the file format is unix, it terminates lines with \n. You can set the mode with the command
:se fileformat=unix
or
:se fileformat=dos
Try creating a file x.txt in Windows and opening it in vim. Then run :se fileformat=unix and :w y.txt, followed by :se fileformat=dos and :w z.txt. Inspect y.txt and z.txt with od -cx. z.txt will have \r\n line endings; y.txt won't.
When only some, but not all, lines in the file end in \r, for example if (unix) git added some headers (without \r) to a file that was created on dos/windows, the file format detection sees the headers first, assumes unix, does not remove the \r from the rest of the file when reading, and shows those as ^M.
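To tell in advance which files vim is likely to show with ^M, one option is to compare the total number of lines with the number of lines ending in \r. The sketch below is only an approximation of the detection described above, not vim's actual implementation (which also depends on the 'fileformats' option); the function name `classify` and the sample files are hypothetical.

```shell
# A literal carriage return, built portably without relying on $'\r'.
cr=$(printf '\r')

# Print "dos" if every line ends in CR, "unix" if none do, "mixed" otherwise.
classify() {
  total=$(grep -c '' "$1" || true)        # all lines in the file
  crlf=$(grep -c "${cr}\$" "$1" || true)  # lines whose last character is CR
  if [ "$crlf" -eq 0 ]; then
    echo unix
  elif [ "$crlf" -eq "$total" ]; then
    echo dos
  else
    echo mixed
  fi
}

# Hypothetical sample files, for illustration only.
demo=$(mktemp -d)
printf 'a\r\nb\r\n' > "$demo/dos.txt"
printf 'a\nb\n' > "$demo/unix.txt"
printf 'header\na\r\nb\r\n' > "$demo/mixed.txt"

for f in "$demo"/*.txt; do
  printf '%s: %s\n' "$f" "$(classify "$f")"
done
```

Files reported as "mixed" are the ones where the ^M characters would be expected to show up in vim, while "dos" files would open with [dos] in the status line and no visible ^M.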