I had an issue reading a Linux file under Windows. Here is the issue discussion: Using fstream::seekg under windows on a file created under Unix.
The issue was worked around by opening the text file with std::ios_base::binary specified.
But what's the actual point of this mode? If it is specified, you can still work with your file as a text file (writing with mystream << "Hello World" << std::endl and reading with std::getline).
Under Windows, the only difference I could notice is that mystream << "Hello World" << std::endl uses:
- 0x0D 0x0A as line separator if std::ios_base::binary was not specified (carriage return and line feed)
- 0x0A as line separator if std::ios_base::binary was specified (line feed only)
Notepad does not show the line breaks when opening files generated with std::ios_base::binary. Better editors like vi or WordPad do show them.
Is that really the only difference between files generated with and without std::ios_base::binary? The documentation says "Consider stream as binary rather than text." What does this mean in the end?
Is it safe to always set std::ios_base::binary if I don't care about opening the file in Notepad and want fstream::seekg to always work?
The differences between binary and text modes are implementation
defined, but only concern the lowest level: they do not change the
meaning of things like << and >> (which insert and extract textual
data). Also, formally, outputting all but a few non-printable
characters (like '\n') is undefined behavior if the file is in text
mode.
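To illustrate that << and >> keep their textual meaning in binary mode, here is a minimal sketch. The helper names (round_trip_formatted, raw_contents, demo_formatted_io) and the file name are hypothetical, chosen only for this example.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: writes an int with <<, then reads it back with >>,
// with both streams opened in binary mode.
int round_trip_formatted(const std::string& path, int value) {
    {
        std::ofstream out(path, std::ios_base::out | std::ios_base::binary);
        out << value << '\n';  // still writes decimal digits, not raw int bytes
    }
    std::ifstream in(path, std::ios_base::in | std::ios_base::binary);
    int result = 0;
    in >> result;              // still parses text
    return result;
}

// Hypothetical helper: returns the bytes stored in the file, untranslated.
std::string raw_contents(const std::string& path) {
    std::ifstream in(path, std::ios_base::in | std::ios_base::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Usage sketch: the file holds the characters '4', '2' and a line feed.
void demo_formatted_io() {
    assert(round_trip_formatted("formatted_demo.txt", 42) == 42);
    assert(raw_contents("formatted_demo.txt") == "42\n");
}
```

Binary mode only stops the library from translating '\n' on the way to the file; the formatting done by << is unchanged.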
For the most common OSs: under Unix, there is no distinction; both are
identical. Under Windows, '\n' internally will be mapped to the two
character sequence CR, LF (0x0D, 0x0A) externally, and 0x1A will be
interpreted as an end of file when reading. In more exotic (and mostly
extinct) OSs, however, they could be represented by entirely different
file types at the OS level, and it could be impossible to read a file in
text mode if it were written in binary mode, and vice versa. Or you
could see something different: extra white space at the end of line, or
no '\n' in binary mode.
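One practical consequence of this mapping: when a CR, LF file is read in binary mode (or under Unix), std::getline leaves the '\r' at the end of each line. A minimal sketch of one way to cope with that follows; read_lines_any_eol, demo_read_lines and the file name are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads lines from a file opened in binary mode and
// drops a trailing '\r', so CR LF and LF files give the same result.
std::vector<std::string> read_lines_any_eol(const std::string& path) {
    std::ifstream in(path, std::ios_base::in | std::ios_base::binary);
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();  // remove the CR left over from a CR LF pair
        }
        lines.push_back(line);
    }
    return lines;
}

// Usage sketch: a file with mixed line endings yields clean lines.
void demo_read_lines() {
    {
        std::ofstream out("eol_demo.txt",
                          std::ios_base::out | std::ios_base::binary);
        out << "a\r\nb\n";
    }
    const std::vector<std::string> lines = read_lines_any_eol("eol_demo.txt");
    assert(lines.size() == 2);
    assert(lines[0] == "a");
    assert(lines[1] == "b");
}
```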
With regard to always setting std::ios_base::binary: my policy for
portable files is to decide exactly how I want them formatted, set
binary, and output what I want. Which is often CR, LF, rather than just
LF, since that's the network standard. On the other hand, most
Windows programs have no problems with just LF, but I've encountered
more than a few Unix programs which have problems with CR, LF; which
argues for systematically using just LF (which is easier, too). Doing
things this way means that I get the same results regardless of whether
I'm running under Unix or under Windows.
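The policy above can be sketched as follows. This is only an illustration: write_lines, size_in_bytes, demo_write_lines and the file names are hypothetical, and the default terminator of LF is just one of the two choices discussed.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: writes each line followed by an explicitly chosen
// terminator. Binary mode keeps the library from translating '\n', so the
// bytes on disk are the same under Unix and Windows.
bool write_lines(const std::string& path,
                 const std::vector<std::string>& lines,
                 const std::string& eol = "\n") {
    std::ofstream out(path, std::ios_base::out | std::ios_base::binary);
    for (const std::string& line : lines) {
        out << line << eol;
    }
    out.flush();
    return static_cast<bool>(out);  // false if opening or writing failed
}

// Hypothetical helper: size of the file in bytes, measured in binary mode.
std::streamoff size_in_bytes(const std::string& path) {
    std::ifstream in(path, std::ios_base::in | std::ios_base::binary |
                               std::ios_base::ate);
    return static_cast<std::streamoff>(in.tellg());
}

// Usage sketch: "one" + "two" is 6 bytes, plus one terminator per line.
void demo_write_lines() {
    assert(write_lines("portable_lf.txt", {"one", "two"}));
    assert(size_in_bytes("portable_lf.txt") == 8);    // 6 + 2 * 1
    assert(write_lines("portable_crlf.txt", {"one", "two"}, "\r\n"));
    assert(size_in_bytes("portable_crlf.txt") == 10); // 6 + 2 * 2
}
```

Note that std::endl is avoided here; it inserts '\n' and flushes, and the terminator is meant to be whatever the caller chose.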
I found (by losing two hours of work trying to understand what was going on) a situation where specifying std::ios_base::binary does make a huge difference.
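#include <fstream>
#include <vector>

int main() {
    std::vector<char> data{ 0x01, 0x02, 0x0A, 0x0B };
    {
        // Binary mode: the bytes are written unchanged.
        std::fstream tfat;
        tfat.open( "binary", std::ios_base::out | std::ios_base::binary );
        tfat.write( data.data(), data.size() );
        tfat.close();
    }
    {
        // Text mode: under Windows, each 0x0A is written as 0x0D 0x0A.
        std::fstream tfat;
        tfat.open( "not_binary", std::ios_base::out );
        tfat.write( data.data(), data.size() );
        tfat.close();
    }
}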
Then the "binary" file contains 4 bytes: 0x01, 0x02, 0x0A, 0x0B
But the "not_binary" file contains 5 bytes: 0x01, 0x02, 0x0D, 0x0A, 0x0B
0x0D (\r) was inserted before 0x0A (\n). Since I wrote 4 bytes, I expected to have 4 bytes in the file in the end.
So this made me realize why std::ios_base::binary must be used when writing raw data to a file, even when not using the << operator.
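One way to check the byte counts reported above is to read each file back in binary mode and compare. The sketch below assumes nothing beyond the standard library; write_bytes, read_bytes, demo_byte_counts and the file names are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: writes `data` with write(), in binary or text mode.
void write_bytes(const std::string& path, const std::vector<char>& data,
                 bool binary) {
    std::ios_base::openmode mode = std::ios_base::out;
    if (binary) {
        mode |= std::ios_base::binary;
    }
    std::ofstream out(path, mode);
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
}

// Hypothetical helper: reads the file back byte for byte (always binary,
// so nothing is translated on the way in).
std::vector<char> read_bytes(const std::string& path) {
    std::ifstream in(path, std::ios_base::in | std::ios_base::binary);
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}

// Usage sketch: binary mode round-trips exactly; text mode may not.
void demo_byte_counts() {
    const std::vector<char> data{0x01, 0x02, 0x0A, 0x0B};
    write_bytes("bytes_binary.dat", data, true);
    assert(read_bytes("bytes_binary.dat") == data);

    write_bytes("bytes_text.dat", data, false);
    const std::size_t n = read_bytes("bytes_text.dat").size();
    assert(n == 4 || n == 5);  // 5 where '\n' is expanded (Windows), else 4
}
```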