
Why is CR LF changed to LF on Windows?

On Windows, when you read the characters \r\n from a file (or stdin) in text mode, the \r gets deleted and you only read \n.

Is there a standard according to which it should be so?

Can I be sure that this will be true for any compiler on Windows? Will other platform-specific character combinations be replaced by \n on those platforms too?

I use this code to generate the input and this code to read it. The results are here. You may notice a few missing \r's.

asked Jun 28 '13 18:06 by RiaD


1 Answer

Yes, this comes from compatibility with C. In C text streams, lines are terminated by a newline character. This is the internal representation of the text stream as seen by the program. The I/O library converts between the internal representation and some external one.

The internal representation is platform-independent, whereas there are different platform-specific conventions for text. That's the point of having a text mode in the stream library: portable text-manipulating programs can be written which do not have to contain a pile of #ifdef directives to work on different platforms, or build their own platform-independent text abstraction.
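To make the conversion concrete, here is a minimal sketch using the C stdio functions from C++. The helper names write_text and read_all are my own, not anything from the standard. The idea is that the program only ever deals with \n in text mode, while the bytes on disk (seen by reopening the file with "rb") may differ by platform.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: write `text` through a text-mode stream ("w").
// The I/O library may translate each '\n' into the platform's line terminator.
bool write_text(const char* path, const std::string& text) {
    std::FILE* f = std::fopen(path, "w");
    if (!f) return false;
    bool ok = std::fputs(text.c_str(), f) >= 0;
    return std::fclose(f) == 0 && ok;
}

// Hypothetical helper: read the whole file using the given fopen mode.
// "r" yields the internal (translated) representation, "rb" the raw bytes.
std::string read_all(const char* path, const char* mode) {
    std::string result;
    std::FILE* f = std::fopen(path, mode);
    if (!f) return result;
    int c;
    while ((c = std::fgetc(f)) != EOF) {
        result.push_back(static_cast<char>(c));
    }
    std::fclose(f);
    return result;
}
```

If you call write_text(path, "a\nb\n") and then read_all(path, "r"), you should get "a\nb\n" back on any platform. With read_all(path, "rb") I would expect the same four bytes on Unix-like systems and "a\r\nb\r\n" with the usual Windows runtimes, but that external form is exactly the part the standard leaves to the implementation.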

It so happens that the internal representation for C text streams matches the native Unix representation of text files, since the C language and its library originated on Unix. For portability of C programs to other platforms, the text stream abstraction was added, which makes text files on non-Unix systems look like Unix text files.

In the ISO/IEC 9899:1999 standard ("C99"), we have this:

7.19.2 Streams

[...]

A text stream is an ordered sequence of characters composed into lines, each line consisting of zero or more characters plus a terminating new-line character. Whether the last line requires a terminating new-line character is implementation-defined. Characters may have to be added, altered, or deleted on input and output to conform to differing conventions for representing text in the host environment. Thus, there need not be a one-to-one correspondence between the characters in a stream and those in the external representation.

Bold emphasis mine. C++ streams are defined in terms of C streams. There is no explanation of text versus binary mode in the C++ standard, except for a table which maps various stream mode flag combinations to strings suitable as mode arguments to fopen.
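As a rough illustration of that mapping, the sketch below opens the same file with and without std::ios::binary. The helper names write_bytes and read_with_mode are made up for this example. Per the table in the C++ standard, out | binary | trunc corresponds to the fopen mode "wb", in to "r", and in | binary to "rb".

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: write the bytes exactly as given.
// out | binary | trunc maps to the fopen mode "wb", so no translation occurs.
void write_bytes(const std::string& path, const std::string& bytes) {
    std::ofstream out(path, std::ios::out | std::ios::binary | std::ios::trunc);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
}

// Hypothetical helper: read the whole file with the given open mode.
// std::ios::in maps to "r" (text), std::ios::in | std::ios::binary to "rb".
std::string read_with_mode(const std::string& path, std::ios::openmode mode) {
    std::ifstream in(path, mode);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

After write_bytes(path, "line\r\n"), reading with std::ios::in | std::ios::binary returns the same six bytes everywhere. Reading with plain std::ios::in typically gives "line\n" on Windows, which is the behaviour the question describes, and "line\r\n" on Unix-like systems, where text mode does no translation.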

answered Oct 04 '22 23:10 by Kaz