Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Difference between opening a file in binary vs text [duplicate]

I've done some stuff like:

FILE* a = fopen("a.txt", "w");
const char* data = "abc123";
fwrite(data, 6, 1, a);
fclose(a);

and then in the generated text file, it says "abc123" just like expected. But then I do:

//this time it is "wb" not just "w"
FILE* a = fopen("a.txt", "wb");
const char* data = "abc123";
fwrite(data, 6, 1, a);
fclose(a);

and get the exact same result. If I read the file using binary or normal mode, it also gives me the same result. So my question is, what is the difference between fopening with or without binary mode.

Where I read about fopen modes: http://www.cplusplus.com/reference/cstdio/fopen/

like image 747
BWG Avatar asked Dec 31 '13 22:12

BWG


People also ask

What is the difference between a binary and a text file?

Text files are organized around lines, each of which ends with a newline character ('\n'). The source code files are themselves text files. A binary file is the one in which data is stored in the file in the same way as it is stored in the main memory for processing.

What are the three main differences in opening a file in text mode and binary mode?

When we try to read or write files in our program, usually there are two modes to use. Text mode, usually by default, and binary mode. Obviously, in text mode, the program writes data to file as text characters, and in binary mode, the program writes data to files as 0/1 bits.

What is the primary difference between files opened in binary mode rather than text mode?

The major difference between these two is that a text file contains textual information in the form of alphabets, digits and special characters or symbols. On the other hand, a binary file contains bytes or a compiled version of a text file.

What will happen if you try to open a binary file using a text?

Binary and text data aren't separated: They are simply data. It depends on the interpretation that makes them one or the other. If you open binary data (such as an image file) in a text editor, much of it won't make sense, because it does not fit your chosen interpretation (as text).


2 Answers

The link you gave does actually describe the differences, but it's buried at the bottom of the page:

http://www.cplusplus.com/reference/cstdio/fopen/

Text files are files containing sequences of lines of text. Depending on the environment where the application runs, some special character conversion may occur in input/output operations in text mode to adapt them to a system-specific text file format. Although on some environments no conversions occur and both text files and binary files are treated the same way, using the appropriate mode improves portability.

The conversion could be to normalize \r\n to \n (or vice-versa), or maybe ignoring characters beyond 0x7F (a-la 'text mode' in FTP). Personally I'd open everything in binary-mode and use a good Unicode or other text-encoding library for dealing with text.

like image 67
Dai Avatar answered Sep 21 '22 14:09

Dai


The most important difference to be aware of is that with a stream opened in text mode you get newline translation on non-*nix systems (it's also used for network communications, but this isn't supported by the standard library). In *nix newline is just ASCII linefeed, \n, both for internal and external representation of text. In Windows the external representation often uses a carriage return + linefeed pair, "CRLF" (ASCII codes 13 and 10), which is converted to a single \n on input, and conversely on output.


From the C99 standard (the N869 draft document), §7.19.2/2,

A text stream is an ordered sequence of characters composed into lines, each line consisting of zero or more characters plus a terminating new-line character. Whether the last line requires a terminating new-line character is implementation-defined. Characters may have to be added, altered, or deleted on input and output to conform to differing conventions for representing text in the host environment. Thus, there need not be a one- to-one correspondence between the characters in a stream and those in the external representation. Data read in from a text stream will necessarily compare equal to the data that were earlier written out to that stream only if: the data consist only of printing characters and the control characters horizontal tab and new-line; no new-line character is immediately preceded by space characters; and the last character is a new-line character. Whether space characters that are written out immediately before a new-line character appear when read in is implementation-defined.

And in §7.19.3/2

Binary files are not truncated, except as defined in 7.19.5.3. Whether a write on a text stream causes the associated file to be truncated beyond that point is implementation- defined.

About use of fseek, in §7.19.9.2/4:

For a text stream, either offset shall be zero, or offset shall be a value returned by an earlier successful call to the ftell function on a stream associated with the same file and whence shall be SEEK_SET.

About use of ftell, in §17.19.9.4:

The ftell function obtains the current value of the file position indicator for the stream pointed to by stream. For a binary stream, the value is the number of characters from the beginning of the file. For a text stream, its file position indicator contains unspecified information, usable by the fseek function for returning the file position indicator for the stream to its position at the time of the ftell call; the difference between two such return values is not necessarily a meaningful measure of the number of characters written or read.

I think that’s the most important, but there are some more details.

like image 31
Cheers and hth. - Alf Avatar answered Sep 22 '22 14:09

Cheers and hth. - Alf