
Reading proprietary file type [closed]

How does one develop software to read a proprietary file type without having the proprietary software itself? Think of what the OpenOffice folks did with MS Word (.doc) files: OpenOffice can read .doc files.

That might be easy if the proprietary software has an open source SDK. For example, Adobe has the open source Flex SDK, so it's possible to create Flash (.swf) files without having Adobe Flash. But MS Word, I believe, had no open source SDK, so how did the OpenOffice guys get their software to read its files?

Of course I'm using OpenOffice just as an example; my question is general: how could one read a proprietary output file? What's the idea here? I know someone will say reverse engineering, but I don't think reverse engineering the entire software makes sense here (not that I know anything about that field yet), because the goal is not to create software with the same functionality. Is there a way to work with the output file only?

Any thoughts on this?

zoo asked Feb 13 '26 23:02



1 Answer

It's an iterative process:

  • Inspect the stream of raw bytes in the file and make a guess as to what they mean
  • Write code to verify the guess
  • See what goes wrong when you try to load the file
  • Repeat
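To make the loop concrete, here is a minimal Python sketch of one round of it. The file layout is entirely made up for illustration (a 4-byte magic `DEMO`, a little-endian uint16 version, a uint32 record count, then length-prefixed records), and `hexdump`, `parse_guess` and `GuessFailed` are hypothetical names, not part of any real tool. The point is only that the guess is written as strict code, so a wrong guess fails loudly and tells you where to look next.

```python
import struct


def hexdump(data, width=16):
    """Render bytes as offset / hex / ASCII columns for eyeballing structure."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk).ljust(width * 3 - 1)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part}  {text_part}")
    return "\n".join(lines)


class GuessFailed(Exception):
    """Raised when the file contradicts the current guess about its layout."""


def parse_guess(data):
    """Current guess (invented layout): magic, version, count, then records.

    Each record is assumed to be a uint16 length followed by that many bytes.
    """
    if len(data) < 10:
        raise GuessFailed("file is shorter than the guessed 10-byte header")
    magic, version, count = struct.unpack_from("<4sHI", data, 0)
    if magic != b"DEMO":
        raise GuessFailed(f"unexpected magic {magic!r}")
    offset = 10
    records = []
    for index in range(count):
        if offset + 2 > len(data):
            raise GuessFailed(f"record {index}: length field runs past end of file")
        (length,) = struct.unpack_from("<H", data, offset)
        offset += 2
        if offset + length > len(data):
            raise GuessFailed(f"record {index}: payload runs past end of file")
        records.append(data[offset:offset + length])
        offset += length
    if offset != len(data):
        # Leftover bytes mean the guess does not explain the whole file yet.
        raise GuessFailed(f"{len(data) - offset} unexplained trailing bytes")
    return records


# A hand-built sample that matches the invented layout.
sample = (
    b"DEMO"
    + struct.pack("<HI", 1, 2)
    + struct.pack("<H", 2) + b"hi"
    + struct.pack("<H", 3) + b"abc"
)

print(hexdump(sample))
print(parse_guess(sample))
```

Being strict about trailing bytes is a deliberate choice: a parser that silently ignores what it does not understand will hide exactly the features you have not figured out yet.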

You'll need a wide variety of test files, a lot of patience and large dollops of insight.
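One way a pile of test files pays off is by comparing them byte for byte: offsets that never change are likely magic numbers or fixed header fields, and offsets that vary are where the content lives. A rough sketch of that idea follows; `constant_offsets` and `group_runs` are hypothetical helper names, and the three sample byte strings are invented stand-ins for real files.

```python
def constant_offsets(samples):
    """Return offsets (within the shortest sample) where every sample agrees."""
    if not samples:
        return []
    shortest = min(len(s) for s in samples)
    return [i for i in range(shortest) if len({s[i] for s in samples}) == 1]


def group_runs(offsets):
    """Collapse sorted offsets into half-open (start, end) ranges."""
    runs = []
    for off in offsets:
        if runs and runs[-1][1] == off:
            runs[-1] = (runs[-1][0], off + 1)
        else:
            runs.append((off, off + 1))
    return runs


# Invented samples: same first six bytes, different content afterwards.
samples = [
    b"DEMO\x01\x00AAAA",
    b"DEMO\x01\x00BBBB",
    b"DEMO\x01\x00ACDC",
]

for start, end in group_runs(constant_offsets(samples)):
    print(f"bytes {start}..{end - 1} are identical in every sample")
```

With real files you would read each one with `open(path, "rb").read()` and, ideally, produce the samples yourself by changing one thing at a time in the original application, so that each difference can be tied to a known cause.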

My experience is that it's pretty easy to handle the basics, but that complex file format features can be a pain to handle.

Bevan answered Feb 17 '26 16:02
