
Structure validation for binary files

I'm looking into ways of formally specifying the format of various binary streams and using a tool to check streams for compliance with the specification. Something like XSD plus any of the validation tools for XML. Or like an extremely complicated grep expression working on a binary level (preferably not, as that would be really hard to read).

Does anybody know of a specification/tool that would be useful?

[Rationale: We are receiving many 3rd party generated binary files on a daily basis and many times they are using bad tools that produce invalid files. We want to give them a tool which they could use as a validator and we don't want to write a specific tool for each format.]

gabr asked Mar 01 '23 15:03



1 Answer

If you think the documentation of Java's .class file format is a good example of a specification, consider looking at Preon. Preon captures that format entirely, and generates documentation like this.

There are actually a couple of other initiatives for capturing the 'syntax' of binary encoded files. ASN.1 is useful, but it doesn't give you a lot of mileage if you intend to capture - say - Java class files. The same holds for BSDL, Flavor, BFlavor and a couple of other initiatives. The problem is that there are a million ways to encode binary data and lots of binary compression techniques, and I think that means there will never be something that captures it all, unless the language itself is extensible.

Google Protocol Buffers basically has the same problem. It defines something like CORBA's CDR, and it's good as long as you don't need something more advanced. Google Protocol Buffers is not going to allow you to capture Java's class file format.
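To make the idea of "one declarative spec, one generic validator" a bit more concrete, here is a minimal Python sketch. It is not Preon and not any existing tool; the names `CLASS_HEADER_SPEC` and `validate` are made up for illustration. It only describes the first four fixed-size fields of a Java class file (magic, minor version, major version, constant pool count), and each field can carry an arbitrary check function, which is a very small form of the extensibility mentioned above.

```python
import struct

# Hypothetical declarative spec: (field name, struct format, optional check).
# Fields are the fixed-size start of a Java class file, all big-endian:
# u4 magic, u2 minor_version, u2 major_version, u2 constant_pool_count.
CLASS_HEADER_SPEC = [
    ("magic", ">I", lambda v: v == 0xCAFEBABE),
    ("minor_version", ">H", None),
    ("major_version", ">H", lambda v: v >= 45),      # 45 is the oldest major version
    ("constant_pool_count", ">H", lambda v: v >= 1),  # count is entries + 1
]


def validate(data, spec):
    """Walk the spec over data; return (decoded fields, list of error strings)."""
    fields, errors, offset = {}, [], 0
    for name, fmt, check in spec:
        size = struct.calcsize(fmt)
        if offset + size > len(data):
            errors.append(f"{name}: truncated at offset {offset}")
            break  # nothing more can be decoded reliably
        (value,) = struct.unpack_from(fmt, data, offset)
        fields[name] = value
        if check is not None and not check(value):
            errors.append(f"{name}: unexpected value {value:#x} at offset {offset}")
        offset += size
    return fields, errors


if __name__ == "__main__":
    sample = bytes.fromhex("cafebabe 0000 0034 0010")
    print(validate(sample, CLASS_HEADER_SPEC))
```

A flat table like this stops working as soon as the layout depends on earlier data, for example the class file constant pool, where the size of each entry depends on its tag byte. That is exactly where a fixed description language runs out and an extensible one is needed.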

Wilfred Springer answered Mar 07 '23 09:03
