
Decoding a file compressed with an obsolete language

I'm trying to decompress a data file that was originally compressed with an extension for AMOS Pro, the old Amiga BASIC language, that shipped with the AMOS Pro compiler. I've still got the programming language and have access to the compressor and decompressor, but I'm trying to decompress the files using C. I ultimately want to be able to view these files on modern hardware without having to resort to using an Amiga emulator first.

However, there's no documentation as to how the compressor worked, so I'm trying to reverse-engineer it solely from watching its behaviour. Here's what I've got so far.

This is a raw file (ASCII):

AABCDEFGHIJKLMNOPQRSTUVWXYZAABCDEFGHIJKLMNOPQRSTUVWXYZAABCDEFGHIJKLMNOPQRSTUVWXYZ

Here's the compressed version (hex):

D802C6B5
05048584
4544C5C4
2524A5A4
6564E5E4
15149594
5554D5D4
3534B591
00000007
AD763363
00000051

Testing with various files has given me a few insights:

  • The last 4 bytes are the size of the original file.
  • The file seems to function as a bit stream, so byte boundaries aren't important (I say this because I've seen ASCII codes appear in a few files and they aren't aligned to byte boundaries).
  • All of the bits in the file are stored in reverse.
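The first observation is easy to check in C. This is only a minimal sketch: it assumes the trailing size is stored big-endian (the 68000's native byte order, and consistent with the example ending in 00 00 00 51), and the helper name `read_trailing_size` is made up for illustration rather than taken from AMOS Pro.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: interpret the last 4 bytes of a squashed buffer as
   the size of the original file. Big-endian order is an assumption.
   Returns 0 on success, -1 if the buffer cannot hold a size field. */
int read_trailing_size(const uint8_t *buf, size_t len, uint32_t *out)
{
    if (buf == NULL || out == NULL || len < 4)
        return -1;

    const uint8_t *p = buf + len - 4;  /* start of the final 32-bit word */
    *out = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return 0;
}
```

For the sample above, the final word 00000051 decodes to 0x51 = 81, which is exactly the length of the raw text (three repetitions of a 27-character sequence).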

The first byte seems to represent a sequence length. In the above example, the value 0xD8 is 11011000 in binary; mirror it (the bits are in reverse) and you get 00011011, which is 0x1B in hex or 27 in decimal. That matches the length of the repeated sequence.
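The mirroring step can be sketched in C as follows. The function name `reverse_bits8` is my own, and this only demonstrates reversing the bits within a single byte; whether the real decompressor works per byte or across wider words is still an open question.

```c
#include <stdint.h>

/* Mirror the 8 bits of a byte: bit 0 swaps with bit 7, bit 1 with bit 6,
   and so on. Done in three steps: swap nibbles, then bit pairs, then
   adjacent bits. */
uint8_t reverse_bits8(uint8_t b)
{
    b = (uint8_t)(((b & 0xF0u) >> 4) | ((b & 0x0Fu) << 4));
    b = (uint8_t)(((b & 0xCCu) >> 2) | ((b & 0x33u) << 2));
    b = (uint8_t)(((b & 0xAAu) >> 1) | ((b & 0x55u) << 1));
    return b;
}
```

Calling `reverse_bits8(0xD8)` yields 0x1B (27), matching the hand calculation above.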

However, I'm not making any more progress. Does this look like a standard compression algorithm? What do I try next?

Ant asked Feb 08 '14 19:02



1 Answer

As you've posted here, the compression function is called "squash", and it is part of AMOS Pro.

As such, my advice would be to try one of the following lines of attack:

  • Reverse engineer the algorithm by analyzing its output: This is definitely not a viable option. You will only waste time.
  • Read, annotate, and understand the source code of the unsquash function in AMOS Pro.
  • Contact the author of AMOS Pro.

Read the source code

The source code for AMOS Pro is apparently in the public domain now and can be found here:

http://www.pianetaamiga.it/downloads/AMOSPro_Sources.zip

It consists of 68000 assembly code and quite a few compiled object files.

The unsquash function can be found in the file +header.s, from line 1061 onwards. It is not documented apart from its entry register values, which is at least something. It doesn't appear to be a very large function, so this might be worth a shot.

You will need to have, or learn, rudimentary 68000 assembly. The function does not appear to call out to system libraries and only seems to operate directly on memory, which suggests that understanding the code is actually doable. Still, I've never written or read 68000 code in my life, so what do I know.

Contact the author of AMOS Pro

The author of AMOS Pro is François Lionet, as is evident from the User Guide. He founded Clickteam in the mid-90s to make game- and multimedia-making software. He still seems to be with that company and, according to forum posts from others looking into AMOS Pro, he seems willing to answer email. Sadly I don't know his email address, but the Clickteam website should give you a starting point.

Lasse V. Karlsen answered Sep 22 '22 16:09
