Parsing a binary file. What is a modern way?

I have a binary file with a known layout. For example, let the format be:

  • 2 bytes (unsigned short) - length of a string
  • 5 bytes (5 x chars) - the string - some id name
  • 4 bytes (unsigned int) - a stride
  • 24 bytes (6 x float - 2 strides of 3 floats each) - float data

The file would look like this (spaces added for readability):

5 hello 3 0.0 0.1 0.2 -0.3 -0.4 -0.5 

Here 5 is stored as 2 bytes (0x05 0x00), "hello" as 5 bytes, and so on.
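The byte sequence 0x05 0x00 for the value 5 suggests that the fields are stored little-endian. As a minimal sketch under that assumption, the integer fields can be decoded byte by byte, so the result does not depend on the byte order of the machine reading the file. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers: decode little-endian integers from raw bytes.
// Assumption: the file stores the least significant byte first,
// as in 0x05 0x00 for the value 5.
std::uint16_t decode_u16_le(const unsigned char* p) {
    assert(p != nullptr);
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

std::uint32_t decode_u32_le(const unsigned char* p) {
    assert(p != nullptr);
    // Widen each byte before shifting so no shift overflows a signed int.
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}
```

If the file is always produced and consumed on machines with the same byte order, this step may be unnecessary, but it makes the format assumption explicit.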

Now I want to read this file. Currently I do it like this:

  • load the file into an ifstream
  • read 2 bytes from the stream into char buffer[2]
  • cast it to unsigned short: unsigned short len{ *((unsigned short*)buffer) };. Now I have the length of the string.
  • read the string bytes from the stream into a vector<char> and create a std::string from this vector. Now I have the string id.
  • in the same way, read the next 4 bytes and cast them to unsigned int. Now I have the stride.
  • until the end of the file, read floats the same way: create a char bufferFloat[4] and cast *((float*)bufferFloat) for every float.
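The steps above can be sketched without intermediate char buffers by letting the stream write straight into the typed variable. This is only a sketch: `read_value`, `Record`, `read_record` and `read_record_from_bytes` are hypothetical names, and it assumes the file's byte order matches the host, that `float` is 4 bytes, and that `unsigned short` and `unsigned int` correspond to `std::uint16_t` and `std::uint32_t`.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper: read one trivially copyable value directly from the stream.
// Assumption: the file uses the host's byte order.
template <typename T>
T read_value(std::istream& in) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte reads need a trivially copyable type");
    T value{};
    if (!in.read(reinterpret_cast<char*>(&value), sizeof(T)))
        throw std::runtime_error("unexpected end of input");
    return value;
}

// Hypothetical container for the layout described in the question.
struct Record {
    std::string id;
    std::uint32_t stride = 0;
    std::vector<float> data;
};

Record read_record(std::istream& in) {
    Record r;
    const auto len = read_value<std::uint16_t>(in);  // 2-byte string length
    r.id.resize(len);                                // length known only at run time
    if (len != 0 && !in.read(&r.id[0], len))
        throw std::runtime_error("truncated string");
    r.stride = read_value<std::uint32_t>(in);        // 4-byte stride
    float f = 0.0f;
    // Keep reading whole floats until the stream runs out of data.
    while (in.read(reinterpret_cast<char*>(&f), sizeof f))
        r.data.push_back(f);
    return r;
}

// Convenience wrapper for data that is already in memory.
Record read_record_from_bytes(const std::string& bytes) {
    std::istringstream in(bytes, std::ios::binary);
    return read_record(in);
}
```

For a real file, `read_record` would be given a `std::ifstream` opened with `std::ios::binary`. Reading through a `char*` view of the destination object is allowed by the aliasing rules, unlike casting a char buffer to `float*`.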

This works, but to me it looks ugly. Can I read directly into an unsigned short, float, string, etc. without creating a char [x] buffer? If not, what is the correct way to cast (I have read that the style I'm using is old-style)?

P.S.: While writing the question, a clearer formulation came to mind: how do I cast an arbitrary number of bytes from an arbitrary position in a char [x]?
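One well-defined way to pull a value out of an arbitrary position in a byte buffer is `std::memcpy`, which avoids the alignment and strict-aliasing problems of dereferencing a cast pointer such as `*((float*)buffer)`. The sketch below uses a hypothetical helper `load_at` and assumes the bytes in the buffer are in host byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical helper: copy sizeof(T) bytes starting at `offset` into a T.
// Assumption: the bytes are stored in the host's byte order.
template <typename T>
T load_at(const std::vector<char>& buffer, std::size_t offset) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy-based loads need a trivially copyable type");
    // Bounds check written to avoid overflow in offset + sizeof(T).
    if (offset > buffer.size() || sizeof(T) > buffer.size() - offset)
        throw std::out_of_range("read past end of buffer");
    T value;
    std::memcpy(&value, buffer.data() + offset, sizeof(T));
    return value;
}
```

With the example layout, `load_at<std::uint16_t>(buffer, 0)` would give the string length and `load_at<float>(buffer, 11)` the first float, even though offset 11 is not aligned for a float.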

Update: I forgot to mention explicitly that the lengths of the string and of the float data are not known at compile time and are variable.

nikitablack asked Nov 10 '14 14:11



1 Answer

If this is not for learning purposes and you are free to choose the binary format, consider using something like protobuf, which handles the serialization for you and lets you interoperate with other platforms and languages.

If you cannot use a third-party library, you may look at QDataStream for inspiration:

  • Documentation
  • Source code
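As a rough illustration of the stream-operator style that QDataStream exposes, here is a small wrapper around `std::istream`. It is not Qt's API: `BinaryReader` and `read_string` are hypothetical names, and unlike QDataStream this sketch does no byte-order conversion, so it assumes the file matches the host's byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical class loosely modelled on the operator>> style of QDataStream.
class BinaryReader {
public:
    explicit BinaryReader(std::istream& in) : in_(in) {}

    // Read any arithmetic type as raw bytes (host byte order assumed).
    template <typename T>
    typename std::enable_if<std::is_arithmetic<T>::value, BinaryReader&>::type
    operator>>(T& value) {
        in_.read(reinterpret_cast<char*>(&value), sizeof(T));
        return *this;
    }

    // Read exactly `count` raw bytes into a string.
    BinaryReader& read_string(std::string& out, std::size_t count) {
        out.resize(count);
        if (count != 0)
            in_.read(&out[0], static_cast<std::streamsize>(count));
        return *this;
    }

    // True while no read has failed.
    explicit operator bool() const { return !in_.fail(); }

private:
    std::istream& in_;
};
```

A caller could then write `reader >> len; reader.read_string(id, len); reader >> stride;` and check `if (!reader)` once at the end instead of after every field.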
fjardon answered Sep 22 '22 03:09