
How do you lay out your custom binary file format?

Many applications have their own custom binary file format (e.g. .mpq, .wad). On top of that, it's commonly zipped.

So my question is: how do you artfully/skillfully lay out the binary contents of your file? Do you have a "table of contents"-like structure at the beginning? Is it better to dump everything in one file?

So say you have an array of Shapes, and each Shape holds deformed vertex data (the vertex data has changed since it was loaded from its original file, so it has to be saved anew).

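#include <vector>
using std::vector ;

struct Vertex { float x, y, z ; } ; // hypothetical layout; the question does not define Vertex

class Shape
{
public:
    virtual ~Shape() { } // shapes are stored and deleted through Shape*
    vector<Vertex> verts ;
} ;

class Sphere : public Shape { } ; // ...more geometric shapes (Tet, Cube) are defined..

class Model : public Shape { } ; // general model "Shape" loaded from file

vector<Shape*> shapes ; // save me!  contents are a mix of Model, Sphere, Tet..
// each with a variable number of verts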
bobobobo asked Oct 02 '11 21:10




2 Answers

My favorite article on the topic of file formats is at http://www.fadden.com/techmisc/file-formats.htm.

Beyond that, it probably comes down to what kind of data you are storing and how that data will be used (will it primarily be transmitted across a network? How important is seek access? Etc.).

Start with that article; it may help crystallize your thoughts if you already have a format that needs designing.

BobS answered Oct 23 '22 23:10



In short: if you only need serialization, meaning you will read from and write to a stream, then this is a no-brainer. Emit your structs member by member, or use any serialization library there is, from CArchive to .... whatever takes your fancy.
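As a rough illustration of the member-by-member approach, here is a minimal sketch. Everything in it is an assumption made for the example: the Vertex fields, the ShapeKind tag, the ShapeRecord type, the writeRaw/readRaw helpers, and the on-disk layout (1-byte kind, 4-byte vertex count, then three floats per vertex, all in host byte order). It is not a format taken from the question.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical vertex type; the question does not define Vertex.
struct Vertex { float x, y, z; };

// Hypothetical type tag so a reader knows which kind of shape it is loading.
enum class ShapeKind : std::uint8_t { Model = 0, Sphere = 1 };

struct ShapeRecord {
    ShapeKind kind;
    std::vector<Vertex> verts;
};

// Write one fixed-size value as raw bytes (host byte order, an assumption).
template <typename T>
void writeRaw(std::ostream& out, const T& v) {
    out.write(reinterpret_cast<const char*>(&v), sizeof v);
}

// Read one fixed-size value; returns false if the stream ran out of data.
template <typename T>
bool readRaw(std::istream& in, T& v) {
    return static_cast<bool>(in.read(reinterpret_cast<char*>(&v), sizeof v));
}

// Per shape: kind (1 byte), vertex count (4 bytes), then x, y, z per vertex.
// Writing field by field avoids dumping struct padding into the file.
void writeShape(std::ostream& out, const ShapeRecord& s) {
    assert(s.verts.size() <= UINT32_MAX);
    writeRaw(out, static_cast<std::uint8_t>(s.kind));
    writeRaw(out, static_cast<std::uint32_t>(s.verts.size()));
    for (const Vertex& v : s.verts) {
        writeRaw(out, v.x);
        writeRaw(out, v.y);
        writeRaw(out, v.z);
    }
}

// Mirror image of writeShape; returns false on a truncated stream.
bool readShape(std::istream& in, ShapeRecord& s) {
    std::uint8_t kind = 0;
    std::uint32_t count = 0;
    if (!readRaw(in, kind) || !readRaw(in, count)) return false;
    s.kind = static_cast<ShapeKind>(kind);
    s.verts.clear();
    for (std::uint32_t i = 0; i < count; ++i) {
        Vertex v{};
        if (!readRaw(in, v.x) || !readRaw(in, v.y) || !readRaw(in, v.z)) return false;
        s.verts.push_back(v);
    }
    return true;
}
```

Storing the count before the vertices is what lets each shape have a variable number of verts while still being read back in one forward pass.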

If not, and you need to access your data directly inside the file, then your requirements will, with some skill, tell you what the layout of the file should be.

And yeah, this is too broad a topic to dwell on here. For example:

I have a need for a database of thumbnails for my software. Each thumbnail has a timestamp, and I know that they will be of a different size. Requirements are:

  • sequential write (thumbs will be appended to the end of the database)
  • thumbs will be appended in ascending time order
  • direct read (given a time, get the thumbnail in O(1))
  • no later modification of the database
  • thumbnails will arrive at 15-second intervals

Yes, the requirements ARE simple here, but they speak for themselves.

I created two files, one with indexes and the other with pictures.

Storing: append the image to the data file, then append the image's position in the data file to the index file. Reading: find the entry in the index file using simple indexing (the index is (timestamp - timestamp_start) / 15), then use that entry to fetch the image data.
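The two-file scheme above could be sketched roughly like this. The names (ThumbStore, IndexEntry, kEntryBytes) are hypothetical, and storing a size next to each offset is an assumption added so that variable-sized images can be read back. The sketch works on any std::iostream, so two std::fstream objects opened in binary mode would take the place of the two files.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical fixed-size index record: where one thumbnail lives in the data file.
struct IndexEntry {
    std::uint64_t offset;  // byte offset of the image in the data stream
    std::uint32_t size;    // image length in bytes
};

constexpr std::int64_t kIntervalSeconds = 15;  // from the requirements above
constexpr std::streamoff kEntryBytes = 8 + 4;  // offset + size, written field by field

class ThumbStore {
public:
    ThumbStore(std::iostream& index, std::iostream& data, std::int64_t startTime)
        : index_(index), data_(data), start_(startTime) {}

    // Append-only write: image bytes go to the data stream, one record to the index.
    void append(const std::string& image) {
        data_.clear();
        data_.seekp(0, std::ios::end);
        IndexEntry e{static_cast<std::uint64_t>(data_.tellp()),
                     static_cast<std::uint32_t>(image.size())};
        data_.write(image.data(), static_cast<std::streamsize>(image.size()));
        index_.clear();
        index_.seekp(0, std::ios::end);
        index_.write(reinterpret_cast<const char*>(&e.offset), sizeof e.offset);
        index_.write(reinterpret_cast<const char*>(&e.size), sizeof e.size);
    }

    // Direct read: the slot number follows from the timestamp, so no search is needed.
    bool read(std::int64_t timestamp, std::string& image) {
        assert(timestamp >= start_);
        const std::int64_t slot = (timestamp - start_) / kIntervalSeconds;
        IndexEntry e{};
        index_.clear();
        index_.seekg(slot * kEntryBytes, std::ios::beg);
        index_.read(reinterpret_cast<char*>(&e.offset), sizeof e.offset);
        index_.read(reinterpret_cast<char*>(&e.size), sizeof e.size);
        if (!index_) return false;  // no record for that slot yet
        image.resize(e.size);
        data_.clear();
        data_.seekg(static_cast<std::streamoff>(e.offset), std::ios::beg);
        data_.read(&image[0], static_cast<std::streamsize>(e.size));
        return static_cast<bool>(data_);
    }

private:
    std::iostream& index_;
    std::iostream& data_;
    std::int64_t start_;
};
```

Because every index record has the same size, the record for a given time sits at slot * kEntryBytes, which is what makes the lookup constant-time even though the images themselves vary in size.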

Daniel Mošmondor answered Oct 23 '22 23:10
