
Are there (have there been) any efforts to create a schema language for arbitrary binary formats?

Tags: binary

XML has a lot of benefits. It's both machine and human readable, it has a standardized format and it is remarkably versatile.

It also has some disadvantages. It's verbose and not a very efficient means of transferring large amounts of data.

One of the most useful aspects of XML is its schema language. From a schema you can generate source code in any modern programming language to read an XML format, without the tedious hand coding that most other file formats require.

This got me thinking about whether a schema language for arbitrary binary file formats exists and, if not, whether it would be a worthwhile endeavor.

In case I've been unclear: I'm asking about a language whose purpose is to define byte offsets, field and record lengths, delimiters, etc., and that could be parsed to generate code for reading any file format that conforms to that specification.

I doubt I'm the first to suggest such an idea, so if you know of any projects or working groups that have pursued or are currently pursuing this area, I'd be grateful for pointers.
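To make the idea concrete, here is a minimal sketch in Python of what "a schema that gets turned into a reader" could look like. It is not taken from any existing project: the `HEADER_SCHEMA` layout, the field names, and the `compile_schema` and `parse` helpers are all hypothetical, and the schema is interpreted at run time with the standard `struct` module rather than used to generate source code.

```python
import struct

# Hypothetical schema: (field name, struct format code) pairs in file order.
# Nothing here describes a real file format.
HEADER_SCHEMA = [
    ("magic", "4s"),        # 4 raw bytes
    ("version", "H"),       # unsigned 16-bit integer
    ("record_count", "I"),  # unsigned 32-bit integer
]


def compile_schema(schema, byte_order="<"):
    """Turn a field list into the ordered field names and a struct.Struct.

    "<" selects little-endian byte order with no padding between fields.
    """
    names = [name for name, _ in schema]
    layout = struct.Struct(byte_order + "".join(code for _, code in schema))
    return names, layout


def parse(schema, data, offset=0):
    """Read one record described by `schema` from `data` at `offset`."""
    names, layout = compile_schema(schema)
    values = layout.unpack_from(data, offset)
    return dict(zip(names, values))


# Build a sample header in memory, then read it back through the schema.
sample = struct.pack("<4sHI", b"DEMO", 2, 7)
header = parse(HEADER_SCHEMA, sample)
print(header)
```

A real schema language would also need to express things this sketch leaves out, such as delimiters, variable-length fields, and fields whose size depends on an earlier field.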

Kenneth Cochran asked Jan 12 '11 15:01




3 Answers

I know this is an old question, but over the last few years Kaitai Struct has emerged as one of the best options for describing arbitrary binary formats. That it also generates parsing code is a huge bonus.

https://kaitai.io/

"develop parsers for binary structures"

SashiOno answered Nov 03 '22 22:11


xtype is a new general-purpose binary data language I developed that also covers the typical use cases of XML: https://github.com/bitagoras/xtype/ A similar format worth mentioning here is UBJSON, an efficient binary format for JSON-like structures: https://github.com/ubjson/universal-binary-json

bitagoras answered Nov 03 '22 22:11


Yes, several people have tried to do this.

One such attempt is Binary Format Description. Another is Data Format Description Language. I'm not sure how practical either one really is, though.

Raph Levien answered Nov 03 '22 22:11