Are there situations where I should prefer a binary file to a text file? I'm using C++ as my programming language.
For example, if I have to store a large text file, is it better to use a text file or a binary file?
Edit
For the moment, the file has no requirement to be human-readable. Are there performance differences, security differences, and so on?
Edit
Sorry for omitting the other requirements (thanks to Carey Gregory)
Text files are more restrictive than binary files since they can only contain textual data. However, unlike binary files, they tend to tolerate corruption better. While a small error in a binary file may make it unreadable, a small error in a text file may simply show up as a few wrong characters once the file has been opened.
The two file types may look the same on the surface, but their internal structures are different. While both binary and text files contain data stored as a series of bits (binary values of 1s and 0s), the bits in text files represent characters, while the bits in binary files represent custom data.
A binary file is usually very much smaller than a text file that contains an equivalent amount of data. For image, video, and audio data this is important. Small files save storage space, can be transmitted faster, and are processed faster. I/O with smaller files is faster, too, since there are fewer bytes to move.
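As a rough illustration of the size difference, here is a minimal sketch that serializes the same 32-bit integers both ways into in-memory strings. The helper names `to_text` and `to_binary` are made up for this example, and the binary form uses the machine's native byte order, so it is only meant for comparing sizes, not as a portable format.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Serialize the values as decimal text, one number per line.
// (Hypothetical helper for this comparison only.)
std::string to_text(const std::vector<std::uint32_t>& values) {
    std::ostringstream out;
    for (std::uint32_t v : values) {
        out << v << '\n';
    }
    return out.str();
}

// Serialize the values as raw 4-byte fields in native byte order.
// Not portable across architectures; used here only to compare sizes.
std::string to_binary(const std::vector<std::uint32_t>& values) {
    std::ostringstream out;
    for (std::uint32_t v : values) {
        out.write(reinterpret_cast<const char*>(&v), sizeof v);
    }
    return out.str();
}
```

With three ten-digit values such as 1000000000, 2000000000 and 4000000000, the text form takes 33 bytes and the binary form 12. With the values 1, 2 and 3 the text form is only 6 bytes while the binary form is still 12, which is why "usually smaller" depends on the data.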
Text protocols are better in terms of readability, ease of reimplementing, and ease of debugging. Binary protocols are more compact. However, you can compress your text using a library like LZO or Zlib, and this is almost as compact as binary (with very little performance hit for compression/decompression.)
As a general rule, define a text format, and use it. It's much easier to develop and debug, and it's much easier to see what is going wrong if it doesn't work.
If you find that the files are becoming too big, or taking too much time to transfer over the wire, consider compressing them. A compressed text file is often smaller than what you can achieve with binary. Or consider a less verbose text format; it's possible to reliably transmit a text representation of your data with far fewer characters than XML uses.
And finally, if you do end up having to use binary, try to choose an existing format (e.g. Google's Protocol Buffers), or base your format on an existing format. Just remember that:
Binary is a lot more work than text, since you practically have to write all of the << operators again, including those in the standard library.
Binary is a lot more difficult to debug, because you can't easily see what you've actually done.
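To give an idea of the extra work involved, here is a minimal sketch of what replaces a single `<<` for one integer type when the on-disk layout has to be fixed. The function names `write_u32_be` and `read_u32_be` and the choice of big-endian order are assumptions made for the example, not something the answer prescribes.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Write a 32-bit unsigned integer as four bytes, most significant byte first,
// so the file layout does not depend on the machine's byte order.
void write_u32_be(std::ostream& out, std::uint32_t value) {
    const char bytes[4] = {
        static_cast<char>((value >> 24) & 0xFF),
        static_cast<char>((value >> 16) & 0xFF),
        static_cast<char>((value >> 8) & 0xFF),
        static_cast<char>(value & 0xFF),
    };
    out.write(bytes, sizeof bytes);
}

// Read four bytes written by write_u32_be. Returns false on a short read.
bool read_u32_be(std::istream& in, std::uint32_t& value) {
    unsigned char bytes[4];
    if (!in.read(reinterpret_cast<char*>(bytes), sizeof bytes)) {
        return false;
    }
    value = (std::uint32_t{bytes[0]} << 24) | (std::uint32_t{bytes[1]} << 16) |
            (std::uint32_t{bytes[2]} << 8) | std::uint32_t{bytes[3]};
    return true;
}
```

A similar pair would be needed for every type you store (other integer widths, floating point, strings with a length prefix), which is where most of the effort goes.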
Concerning your last edit:
Once you've encrypted, the results will be binary. You can use a text representation of the binary (base64 or some such), but the results won't be any more readable than the binary, so it's not worth the bother. If you're encrypting in process, before writing to disk, you automatically lose all of the advantages of text.
The issues concerning powering off mean that you cannot use ofstream directly. You must open or create the file with the necessary options for full transactional integrity (O_SYNC as a flag to open under Unix). You must write each record as a single write request to the system.
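Under Unix, that could look roughly like the following sketch. It assumes a POSIX system; `append_record` is a hypothetical helper name, and the `O_APPEND` flag and the 0644 permissions are choices made for the example rather than requirements.

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cstddef>
#include <string>

// Append one record to `path` using a single write() call on a descriptor
// opened with O_SYNC, so write() returns only after the data has been
// handed to the storage device. Returns true only if the whole record
// was accepted and the descriptor closed cleanly.
bool append_record(const std::string& path, const std::string& record) {
    int fd = ::open(path.c_str(), O_WRONLY | O_CREAT | O_APPEND | O_SYNC, 0644);
    if (fd < 0) {
        return false;
    }
    // One record, one write request: no buffering layer in between.
    ssize_t written = ::write(fd, record.data(), record.size());
    bool ok = written >= 0 &&
              static_cast<std::size_t>(written) == record.size();
    if (::close(fd) != 0) {
        ok = false;
    }
    return ok;
}
```

Opening and closing the file for every record keeps the sketch short; a real program would more likely keep the descriptor open and treat a short write as an error to be handled explicitly.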
It's always a good idea to have a checksum, just in case. If you're worried about security, SHA1 is a good choice. But keep in mind that if someone has access to the file, and wants to intentionally change it, they can recalculate the SHA1 and insert the new value as well.
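As a sketch of the "checksum just in case" idea, here is a small CRC-32 routine. Note that this swaps in a different technique from the SHA1 mentioned above: computing SHA1 would normally mean pulling in a cryptographic library, whereas CRC-32 fits in a few lines but only guards against accidental corruption, not deliberate changes. The function name `crc32_of` is made up for the example.

```cpp
#include <cstdint>
#include <string>

// Bitwise CRC-32 using the reflected polynomial 0xEDB88320
// (the same CRC variant used by zlib and PNG).
// Slow but short; table-driven versions are faster.
std::uint32_t crc32_of(const std::string& data) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char byte : data) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right one bit; apply the polynomial if a 1 fell out.
            crc = (crc >> 1) ^ ((crc & 1u) ? 0xEDB88320u : 0u);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

One way to use it would be to store the checksum alongside each record when writing, then recompute it on reading and reject the record if the two values differ.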