
Decompressing LZMA using C-LZMA-SDK in C++ returns SZ_ERROR_DATA because the first Byte of the input stream is != 0

Tags:

c++

I have a file that, according to its owner, is LZMA compressed. lzmadecode.exe (the program) decodes it without problems, so the file is not corrupt and does seem to be LZMA encoded.

Here is the code where I read the file to a buffer and call the UnCompress function:

int main() 
{
    ::std::ifstream lReplayFileStream("C:\\tmp\\COMPRESSED_FILE", ::std::ios::binary);
    if (lReplayFileStream)
    {
        lReplayFileStream.seekg(0, lReplayFileStream.end);
        std::streamoff lFileSize = lReplayFileStream.tellg();
        lReplayFileStream.seekg(0, lReplayFileStream.beg);

        char * lReplayBuffer = new char[lFileSize];
        lReplayFileStream.read(lReplayBuffer, lFileSize);

        if (lReplayFileStream.gcount() != lFileSize)
        {
            // Error    
        }
        lReplayFileStream.close();

        ::std::vector<unsigned char> inBuf(lFileSize);
        ::std::vector<unsigned char> outBuf;

        memcpy(&inBuf[0], lReplayBuffer, lFileSize);

        UNCOMPRESSED_SIZE = lFileSize + lFileSize * 3;

        UnCompress(outBuf, inBuf);

        delete[] lReplayBuffer;
    }

    return EXIT_SUCCESS;
}

Here is the UnCompress function (not written by me; it is an example I found on the Internet):

static void UnCompress(std::vector<unsigned char> &outBuf, const std::vector<unsigned char> &inBuf)
{
    outBuf.resize(UNCOMPRESSED_SIZE);
    unsigned dstLen = outBuf.size();
    unsigned srcLen = inBuf.size() - LZMA_PROPS_SIZE;
    SRes res = LzmaUncompress(&outBuf[0], &dstLen, &inBuf[LZMA_PROPS_SIZE], &srcLen, &inBuf[0], LZMA_PROPS_SIZE);
    outBuf.resize(dstLen); // If uncompressed data can be smaller
}  

The file starts with the following bytes: 5D 00 00 20 00 B6 EC 07 00
or in ASCII (non-printable bytes shown as .): ] . . (space) . ¶ ì . .

LZMA_PROPS_SIZE is always 5.

As you can see in the UnCompress function, inBuf is passed to LzmaUncompress() at the offset LZMA_PROPS_SIZE; the bytes before that offset are probably the header.

I debugged the code and found that a subroutine of LzmaUncompress() checks whether the first byte of the data it receives is != 0 and, if so, returns SZ_ERROR_DATA.

Screenshot where the error happens

As you can see, the byte in p->tmpBuf[0] (the start of the data taken from inBuf) is ¶, which is B6 in hex. That is the 6th byte of inBuf, because the function is given inBuf + LZMA_PROPS_SIZE (5).

I really don't know much about LZMA, but why does the first byte after the LZMA_PROPS_SIZE bytes have to be 0, and why can lzmadecompress.exe decompress the file when it uses the same function?

What am I doing wrong?

Lyan asked Oct 17 '22 18:10


1 Answer

SOLUTION

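The 8 bytes that follow the LZMA_PROPS_SIZE (5) property bytes store the size of the uncompressed data and are therefore still part of the header, which is 13 bytes in total. The program failed because I handed those 8 size bytes to the decoder as if they were the start of the compressed stream, so it saw B6 instead of the expected 0 as its first byte.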
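To make the header layout concrete, here is a minimal sketch that decodes those 13 bytes by hand. The names LzmaFileHeader and parseLzmaFileHeader are made up for this illustration and are not part of the LZMA SDK; the sketch assumes the classic .lzma layout (1 packed property byte, 4 bytes of dictionary size, 8 bytes of uncompressed size, all little-endian).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names; only the byte layout is taken from the .lzma format.
constexpr std::size_t kPropsSize  = 5;   // same value as LZMA_PROPS_SIZE
constexpr std::size_t kSizeField  = 8;   // uncompressed size, little-endian
constexpr std::size_t kHeaderSize = kPropsSize + kSizeField;

struct LzmaFileHeader {
    unsigned lc = 0, lp = 0, pb = 0;      // packed into the first property byte
    std::uint32_t dictSize = 0;           // property bytes 1..4, little-endian
    std::uint64_t uncompressedSize = 0;   // bytes 5..12; all 0xFF means "unknown"
};

// Returns false if the buffer is too short or the property byte is invalid.
bool parseLzmaFileHeader(const std::vector<unsigned char>& in, LzmaFileHeader& out)
{
    if (in.size() < kHeaderSize) return false;

    unsigned d = in[0];
    if (d >= 9 * 5 * 5) return false;     // valid values encode (pb * 5 + lp) * 9 + lc
    out.lc = d % 9;
    d /= 9;
    out.lp = d % 5;
    out.pb = d / 5;

    out.dictSize = 0;
    for (std::size_t i = 0; i < 4; ++i)
        out.dictSize |= static_cast<std::uint32_t>(in[1 + i]) << (8 * i);

    out.uncompressedSize = 0;
    for (std::size_t i = 0; i < kSizeField; ++i)
        out.uncompressedSize |= static_cast<std::uint64_t>(in[kPropsSize + i]) << (8 * i);

    return true;
}
```

For the bytes quoted in the question, 5D decodes to lc=3, lp=0, pb=2 and 00 00 20 00 to a dictionary size of 0x200000 (2 MiB). The size field starts with B6 EC 07 00, which would be 519350 bytes if the four bytes not shown in the question are zero. When the field is not the "unknown" marker, that value could probably be used for the output buffer instead of a guess such as UNCOMPRESSED_SIZE.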

To solve the problem, I only had to change these two lines:

unsigned srcLen = inBuf.size() - LZMA_PROPS_SIZE - 8;
SRes res = LzmaUncompress(&outBuf[0], &dstLen, &inBuf[LZMA_PROPS_SIZE + 8], &srcLen, &inBuf[0], LZMA_PROPS_SIZE);
  • Subtract 8 bytes from srcLen.
  • Add 8 bytes to the offset at which LzmaUncompress() starts to decode.
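The two adjustments above can also be wrapped in a small helper that computes the pointers and the length before the call. This is only a sketch: LzmaInputView and splitLzmaInput are hypothetical names, the helper does not call the SDK itself, and the first-byte check merely mirrors the condition described in the question (a compressed stream whose first byte is not 0 gets rejected with SZ_ERROR_DATA).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: where the pieces of a .lzma file buffer live.
struct LzmaInputView {
    const unsigned char* props;    // the 5 property bytes (LZMA_PROPS_SIZE)
    const unsigned char* payload;  // first byte of the compressed stream
    std::size_t payloadSize;       // number of compressed bytes
};

// Splits inBuf into property bytes and compressed payload, skipping the
// 8-byte uncompressed-size field. Returns false if the buffer holds no
// payload or if the payload does not start with a 0 byte.
bool splitLzmaInput(const std::vector<unsigned char>& inBuf, LzmaInputView& view)
{
    const std::size_t propsSize = 5;  // value of LZMA_PROPS_SIZE
    const std::size_t sizeField = 8;  // uncompressed size stored in the header

    if (inBuf.size() <= propsSize + sizeField) return false;

    view.props       = inBuf.data();
    view.payload     = inBuf.data() + propsSize + sizeField;
    view.payloadSize = inBuf.size() - propsSize - sizeField;

    // The decoder in the question rejected the stream because this byte was B6.
    return view.payload[0] == 0;
}
```

With such a view, the call would pass view.payload and view.props in place of the hand-computed &inBuf[...] expressions, with srcLen initialized from view.payloadSize. As far as I know, LzmaUncompress() declares its two length parameters as pointers to size_t, so declaring dstLen and srcLen as size_t instead of unsigned is the safer choice on 64-bit builds.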
Lyan answered Oct 21 '22 05:10