ASCII compressor works for short test file, not on long

Tags: c, bit-manipulation, compression

The current project in Systems Programming is to come up with an ASCII compressor that removes the always-zero top bit of each character and writes the packed contents to a file.

In order to facilitate decompression, the original file size is written to the file first, followed by the compressed bytes. There are two files to run tests on: one is 63 bytes long, and the other is 5344213 bytes. My code below works as expected for the first test file: it writes 56 bytes of compressed text plus 4 bytes of file header.

However, when I try it on the long test file, the compressed version is only 3 bytes shorter than the original, when it should be roughly 652 KiB smaller (one bit saved per 8-bit character, or 12.5% of the original size). I've worked out the binary bit shift values for the first two write loops of the long test file, and they match what is being recorded on my test printout.
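As a sanity check on the expected sizes, here is a minimal sketch. It assumes the packed stream stores 7 bits per character back to back, with the last byte padded; `packed_size` and `print_expected_sizes` are hypothetical helpers, not part of the program below, and the 4-byte header is not counted.

```c
#include <assert.h>
#include <stdio.h>

/* Bytes needed for n characters at 7 bits each: ceil(n * 7 / 8).
   Hypothetical helper; header bytes are not included. */
long packed_size(long n)
{
    return (n * 7 + 7) / 8;
}

/* Print the expected packed size for the two test files. */
void print_expected_sizes(void)
{
    const long sizes[] = { 63, 5344213 };

    for (int i = 0; i < 2; i++) {
        long n = sizes[i];
        long packed = packed_size(n);

        assert(packed <= n); /* packing never grows the data */
        printf("%ld bytes -> %ld packed (%ld saved)\n", n, packed, n - packed);
    }
}
```

For the two test files this gives 56 packed bytes for the 63-byte file, and 4676187 packed bytes for the 5344213-byte file, a saving of 668026 bytes.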

while ( (characters= read(openReadFile, unpacked, BUFFER)) >0 ){
   unsigned char packed[7]; //compression storage
   int i, j, k, writeCount, endLength, endLoop;

    //loop through the buffer array
    for (i=0; i< characters-1; i++){
        j= i%7; 

        //fill up the compressed array
        packed[j]= packer(unpacked[i], unpacked[i+1], j);

        if (j == 6){
            writeCalls++; //track how many calls made

            writeCount= write(openWriteFile, packed, sizeof (packed));
            int packedSize= writeCount;
            for (k=0; k<7 && writeCalls < 10; k++)
                printf("%X ", (int)packed[k]);      

            totalWrittenBytes+= packedSize;
            printf(" %d\n", packedSize);
            memset(&packed[0], 0, sizeof(packed)); //clear array

            if (writeCount < 0)
                printOpenErrors(writeCount);
        }
        //end of buffer array loop
        endLength= characters-i;
        if (endLength < 7){

            for (endLoop=0; endLoop < endLength-1; endLoop++){
                packed[endLoop]= packer(unpacked[endLoop], unpacked[endLoop+1], endLoop);
            }

            packed[endLength]= calcEndBits(endLength, unpacked[endLength]);
        }
    } //end buffer array loop
} //end file read loop

The packer function:

//calculates the compressed byte value for the array
char packer(char i, char j, int k){
    char packStyle;

    switch(k){
        //shift bits based on mod value with 8
        case 0:
                packStyle= ((i & 0x7F) << 1) | ((j & 0x40) >> 6);
            break;
        case 1:
            packStyle= ((i & 0x3F) << 2) | ((j & 0x60) >> 5);
            break;
        case 2:
            packStyle= ((i & 0x1F) << 3) | ((j & 0x70) >> 4);
            break;
        case 3:
            packStyle= ((i & 0x0F) << 4) | ((j & 0x78) >> 3);
            break;
        case 4:
            packStyle= ((i & 0x07) << 5) | ((j & 0x7C) >> 2);
            break;
        case 5:
            packStyle= ((i & 0x03) << 6) | ((j & 0x7E) >> 1);
            break;
        case 6:
            packStyle= ( (i & 0x01 << 7) | (j & 0x7F));
            break;
    }

    return packStyle;
}

I've verified that 7 bytes are written out every time the packed buffer is flushed, and that 763458 write calls are made for the long file, which matches the 5344206 bytes written.

I'm getting the same hex codes from the printout that I worked out in binary beforehand, and I can see the top bit of every byte removed. So why aren't the bit shifts being reflected in the results?

asked Mar 05 '11 04:03

Jason

1 Answer

Ok, since this is homework I'll just give you a few hints without giving out a solution.

First, are you sure that the 56 bytes you get on the first file are the right bytes? Sure, the count looks good, but you got lucky on the count (the second test file is the proof). I can immediately see at least two key mistakes in the code.
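The figures quoted in the question already hint at this. A minimal sketch, assuming a 7-bit packer emits one 7-byte group for every 8 input characters (`full_groups` and `check_flush_count` are hypothetical names used only for illustration):

```c
#include <assert.h>

/* Number of complete 7-byte output groups for n input characters,
   assuming each group packs 8 seven-bit characters. Hypothetical helper. */
long full_groups(long n)
{
    return n / 8;
}

/* Compare the figures reported in the question with that assumption. */
void check_flush_count(void)
{
    const long input_bytes = 5344213;     /* size of the long test file */
    const long observed_flushes = 763458; /* write calls reported */

    /* 7 bytes per flush reproduces the reported 5344206 bytes written. */
    assert(observed_flushes * 7 == 5344206);

    /* Packing 8 characters per group needs far fewer full flushes. */
    assert(full_groups(input_bytes) == 668026);
    assert(full_groups(input_bytes) < observed_flushes);
}
```

In other words, the reported output is within a few bytes of one output byte per input byte. On the 63-byte file the correct answer happens to be 56 bytes as well (7 full groups of 7 bytes plus 7 bytes for the remaining 7 characters), so a matching count there proves little.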

To make sure you have the right output, the byte count is not enough. You need to dig deeper. How about checking the bytes themselves one by one? 63 characters is not that many to go through, eh? There are many ways you can do this. You could use od (a pretty good Linux/Unix tool for looking at the binary contents of files; if you're on Windows, use a hex editor). Or you could print out debug information from within your program.
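For the in-program route, something along these lines could work. It is only a debugging sketch: `format_bits`, `dump_bytes` and `dump_example` are hypothetical helpers, not part of the code in the question. On the command line, `od -t x1 file` prints the same bytes in hex.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the 8 bits of b, most significant first, into out as text. */
void format_bits(unsigned char b, char out[9])
{
    for (int bit = 7; bit >= 0; bit--)
        out[7 - bit] = ((b >> bit) & 1) ? '1' : '0';
    out[8] = '\0';
}

/* Print each byte of buf as offset, hex value and bit pattern. */
void dump_bytes(const unsigned char *buf, size_t n, FILE *stream)
{
    char bits[9];

    for (size_t i = 0; i < n; i++) {
        format_bits(buf[i], bits);
        fprintf(stream, "%6zu  %02X  %s\n", i, (unsigned)buf[i], bits);
    }
}

/* Example: 'A' is 0x41, so its top bit is 0 and the low 7 bits are 1000001. */
void dump_example(void)
{
    const unsigned char sample[] = { 'A', 'B' };
    char bits[9];

    format_bits(sample[0], bits);
    assert(strcmp(bits, "01000001") == 0);
    dump_bytes(sample, sizeof sample, stdout);
}
```

Dumping both the unpacked input and the packed output this way lets you line the bit patterns up by hand and see where each 7-bit character should land.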

Good luck.

answered Oct 24 '22 18:10

asoundmove
