Reading proprietary file type [closed]

Question

How does one develop software to read a proprietary file type without having the proprietary software itself? Something like what the OpenOffice folks did with MS Word (.doc) files: OpenOffice can read .doc files.

That might be easy if the proprietary software has an open-source SDK. For example, Adobe has the open-source Flex SDK, so it's possible to create Flash (.swf) files without having Adobe Flash. But in the case of MS Word, which I believe had no open-source SDK, how did the OpenOffice guys get their software to read its files?

Of course, I'm using OpenOffice just as an example; my question is general. How could one read a proprietary output file? What's the idea here? I know someone will say reverse engineering, but I don't think reverse engineering the entire software makes sense here (not that I know anything about that field yet), because the goal is not to create software with the same functionality. Is there a way to work with the output file only?

Any thoughts on this?

Bevan · Accepted Answer

It's an iterative process:

1. Inspect the stream of raw bytes in the file and make a guess as to what they mean.
2. Write code to verify the guess.
3. See what goes wrong when you try to load the file.
4. Repeat.
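
As a rough illustration of one pass through that loop, here is a minimal Python sketch. The file layout it tests is entirely made up for the example (a 4-byte magic `DEMO`, a little-endian 32-bit record count, then length-prefixed records) and does not describe any real format; `hex_dump`, `parse_guess` and `GuessFailed` are hypothetical names. The point is that the parser fails loudly, with an offset, wherever the guess stops matching the bytes.

```python
import struct


def hex_dump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex and printable ASCII columns for inspection."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3}} {text_part}")
    return "\n".join(lines)


class GuessFailed(Exception):
    """Raised when the bytes contradict the current guess about the layout."""


def parse_guess(data: bytes) -> list[bytes]:
    """Test a HYPOTHETICAL layout: b'DEMO', uint32 LE count, then
    `count` records, each a uint16 LE length followed by that many bytes."""
    if len(data) < 8:
        raise GuessFailed("file too short for the guessed 8-byte header")
    if data[:4] != b"DEMO":
        raise GuessFailed(f"unexpected magic {data[:4]!r}")
    (count,) = struct.unpack_from("<I", data, 4)
    pos = 8
    records = []
    for i in range(count):
        if pos + 2 > len(data):
            raise GuessFailed(f"record {i}: length field runs past end at offset {pos}")
        (length,) = struct.unpack_from("<H", data, pos)
        pos += 2
        if pos + length > len(data):
            raise GuessFailed(f"record {i}: body of {length} bytes runs past end at offset {pos}")
        records.append(data[pos:pos + length])
        pos += length
    if pos != len(data):
        # Unexplained bytes mean the guess is incomplete, even if nothing crashed.
        raise GuessFailed(f"{len(data) - pos} unexplained trailing bytes at offset {pos}")
    return records


# A synthetic file that matches the guess, built only for this demonstration.
sample = (b"DEMO" + struct.pack("<I", 2)
          + struct.pack("<H", 5) + b"hello"
          + struct.pack("<H", 3) + b"abc")
print(hex_dump(sample))
print(parse_guess(sample))
```

With real files, each `GuessFailed` message points at the offset to look at next in the hex dump, which is what drives the following iteration.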

You'll need a wide variety of test files, a lot of patience and large dollops of insight.
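
One way to put a set of test files to work, sketched here as an assumption rather than something described above, is to compare two files that should differ in only one known respect and see which byte ranges change. The helper name `differing_ranges` and the sample bytes are invented for the example.

```python
def differing_ranges(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return half-open (start, end) offset ranges where a and b differ,
    compared over their common length only."""
    ranges = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i  # a new run of differing bytes begins here
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, common))
    return ranges


# Two made-up "files" that differ in one header byte and one text value.
file_a = b"HEAD\x01\x00name=foo"
file_b = b"HEAD\x02\x00name=bar"
print(differing_ranges(file_a, file_b))  # [(4, 5), (11, 14)]
```

Byte ranges that change together with a known change in the content are good candidates for the next guess; a difference in file length is itself a hint that a length or count field exists somewhere.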

My experience is that it's pretty easy to handle the basics, but that complex file format features can be a pain to handle.

Tags:

reverse-engineering

open-source

file-format

zoo
