I'm looking into ways of formally specifying the format of various binary streams and using a tool to check streams for compliance with the specification. Something like XSD plus any of the validation tools for XML. Or like an extremely complicated grep expression working on a binary level (preferably not - that would be really hard to read).
Does anybody know of a specification/tool that would be useful?
[Rationale: We receive many third-party-generated binary files on a daily basis, and the senders often use bad tools that produce invalid files. We want to give them a tool they can use as a validator, and we don't want to write a specific tool for each format.]
If you think the documentation of Java's .class file format is a good example of a specification, consider looking at Preon. Preon captures that format entirely, and generates documentation like this.
There are actually a couple of other initiatives for capturing the 'syntax' of binary encoded files. ASN.1 is useful, but it doesn't give you a lot of mileage if you intend to capture - say - Java class files. The same holds for BSDL, Flavor, BFlavor and a couple of other initiatives. The problem is that there are a million ways to encode binary data and lots of binary compression techniques, and I think that means no specification language will ever capture all of them unless the language itself is extensible.
Google Protocol Buffers basically has the same problem. It defines its own encoding, something like CORBA's CDR, and it's good as long as you don't need something more advanced. Protocol Buffers is not going to allow you to capture Java's class file format.
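To make the "declarative spec plus generic validator" idea concrete, here is a minimal sketch in plain Python. It is not Preon or any of the tools named above; `CLASS_HEADER_SPEC` and `validate` are hypothetical names made up for this illustration. It only checks the first eight bytes of a Java class file (magic number, minor version, major version), and the per-field check is an ordinary function, which is one simple way to keep the description extensible.

```python
import struct

# Hypothetical mini-spec: (field name, struct format, check function).
# Formats use big-endian byte order, as in the Java class file format.
CLASS_HEADER_SPEC = [
    ("magic", ">I", lambda v: v == 0xCAFEBABE),
    ("minor_version", ">H", lambda v: True),   # any value accepted
    ("major_version", ">H", lambda v: v >= 45),  # 45 is the oldest major version
]


def validate(data, spec):
    """Return a list of error strings; an empty list means the data matched."""
    errors = []
    offset = 0
    for name, fmt, check in spec:
        size = struct.calcsize(fmt)
        if offset + size > len(data):
            # Not enough bytes left for this field; stop here.
            errors.append(f"{name}: truncated at offset {offset}")
            break
        (value,) = struct.unpack_from(fmt, data, offset)
        if not check(value):
            errors.append(f"{name}: unexpected value {value:#x} at offset {offset}")
        offset += size
    return errors


if __name__ == "__main__":
    header = struct.pack(">IHH", 0xCAFEBABE, 0, 52)
    print(validate(header, CLASS_HEADER_SPEC))
```

A fixed list of fields like this breaks down as soon as the format has counted arrays, tagged unions or compressed sections (the constant pool that follows this header is already beyond it), which is exactly the limitation described above.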