
File Binary vs Text

Tags: c++, file

Are there situations where I should prefer a binary file to a text file? I'm using C++ as the programming language.

For example, if I have to store some large text file, is it better to use a text file or a binary file?

Edit

For the moment, the file has no requirement to be human-readable. Are there differences in performance, security, and so on?

Edit

Sorry for omitting the other requirements (thanks to Carey Gregory):

  • The records to save are ASCII-encoded.
  • The file must be encrypted (AES).
  • The machine can power off at any time, so I have to try to prevent errors.
  • I have to know if the file changes outside the program; I think I'll use a SHA-1 digest of the file.
Elvis Dukaj asked May 21 '13 12:05





1 Answer

As a general rule, define a text format, and use it. It's much easier to develop and debug, and it's much easier to see what is going wrong if it doesn't work.

If you find that the files are becoming too big, or taking too much time to transfer over the wire, consider compressing them. A compressed text file is often smaller than what you can achieve with binary. Or consider a less verbose text format; it's possible to reliably transmit a text representation of your data with far fewer characters than XML uses.
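As a rough illustration of a less verbose text format, here is a minimal sketch of a one-record-per-line, tab-separated layout. The `Record` type and the `to_line`/`from_line` helpers are hypothetical names invented for this example, and the sketch assumes the name field never contains a tab or a newline.

```cpp
#include <sstream>
#include <string>

// Hypothetical record type, used only for illustration.
struct Record {
    int id;
    double value;
    std::string name;  // assumed to contain no tab or newline
};

// Format one record as a single tab-separated line ending in '\n'.
// Note: the default stream precision (6 significant digits) is used,
// so a double with more digits will not round-trip exactly.
std::string to_line(const Record& r) {
    std::ostringstream out;
    out << r.id << '\t' << r.value << '\t' << r.name << '\n';
    return out.str();
}

// Parse a line produced by to_line. Returns false on malformed input.
bool from_line(const std::string& line, Record& r) {
    std::istringstream in(line);
    if (!(in >> r.id >> r.value)) return false;   // the two numeric fields
    if (in.get() != '\t') return false;           // separator before the name
    return static_cast<bool>(std::getline(in, r.name));
}
```

A record such as `{42, 3.5, "widget"}` becomes the line `42<TAB>3.5<TAB>widget`, which is still readable in any editor and can be inspected with ordinary command-line tools when something goes wrong.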

And finally, if you do end up having to use binary, try to choose an existing format (e.g. Google's Protocol Buffers), or base your format on an existing format. Just remember that:

  • Binary is a lot more work than text, since you practically have to write all of the << operators again, including those in the standard library.

  • Binary is a lot more difficult to debug, because you can't easily see what you've actually done.
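To give an idea of the extra work involved, here is a minimal sketch of what replaces a single `<<` for an integer and a string in a hand-rolled binary format. The layout (32-bit little-endian integers, length-prefixed strings) and the helper names are assumptions made for this example, not an existing format.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Append a 32-bit value in little-endian byte order, whatever the host order.
void put_u32(std::string& buf, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        buf.push_back(static_cast<char>((v >> (8 * i)) & 0xFF));
}

// Append a string as a 32-bit length followed by the raw bytes.
void put_string(std::string& buf, const std::string& s) {
    put_u32(buf, static_cast<std::uint32_t>(s.size()));
    buf.append(s);
}

// Read a 32-bit little-endian value at pos and advance pos.
// Returns false if fewer than 4 bytes remain.
bool get_u32(const std::string& buf, std::size_t& pos, std::uint32_t& v) {
    if (pos > buf.size() || buf.size() - pos < 4) return false;
    v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(
                 static_cast<unsigned char>(buf[pos + i])) << (8 * i);
    pos += 4;
    return true;
}

// Read a length-prefixed string at pos and advance pos.
// Returns false if the buffer is truncated.
bool get_string(const std::string& buf, std::size_t& pos, std::string& s) {
    std::uint32_t len = 0;
    if (!get_u32(buf, pos, len)) return false;
    if (buf.size() - pos < len) return false;
    s.assign(buf, pos, len);
    pos += len;
    return true;
}
```

Every type you want to store needs a pair of functions like these, with explicit decisions about width, byte order and truncation, and the resulting bytes can only be checked with a hex dump.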

Concerning your last edit:

  • Once you've encrypted, the results will be binary. You can use a text representation of the binary (base64 or some such), but the results won't be any more readable than the binary, so it's not worth the bother. If you're encrypting in process, before writing to disk, you automatically lose all of the advantages of text.

  • The issues concerning powering off mean that you cannot use ofstream directly. You must open or create the file with the necessary options for full transactional integrity (O_SYNC as a flag to open under Unix). You must write each record as a single write request to the system.

  • It's always a good idea to have a checksum, just in case. If you're worried about security, SHA1 is a good choice. But keep in mind that if someone has access to the file, and wants to intentionally change it, they can recalculate the SHA1 and insert the new value as well.
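For the power-off point, a minimal sketch of what this might look like on a POSIX system follows. `O_SYNC` is the flag mentioned above; the use of `O_APPEND`, the `0600` permission bits and the function names `open_log` and `append_record` are assumptions made for this example. Each record is handed to the system in one `write` call, and a short write is reported as a failure rather than silently completed in a second call.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <string>
#include <sys/types.h>
#include <unistd.h>

// Open (or create) the file so that every write() is appended at the end
// (O_APPEND) and returns only after the data has been handed to stable
// storage (O_SYNC). Returns the file descriptor, or -1 on error.
int open_log(const char* path) {
    return ::open(path, O_WRONLY | O_CREAT | O_APPEND | O_SYNC, 0600);
}

// Write one complete record with a single write() request.
// Retries only if the call was interrupted before writing anything.
// Returns false on error or if the system accepted only part of the record;
// the caller then has to decide how to recover.
bool append_record(int fd, const std::string& record) {
    ssize_t n;
    do {
        n = ::write(fd, record.data(), record.size());
    } while (n < 0 && errno == EINTR);
    return n == static_cast<ssize_t>(record.size());
}
```

Synchronous writes are slow, so this probably only makes sense if records are written at a modest rate. A reader of the file should still be prepared to find an incomplete last record after a power failure, which is one more reason to keep a per-record checksum.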

James Kanze answered Sep 24 '22 00:09
