 

How can I avoid zlib "unexpected end of file" when GUnzipping partial files?

Tags:

node.js

zlib

I'm trying to read part of a gzipped file while decompressing it so I can parse the header contents without reading unnecessary bytes. I had this working previously by using fs.read() with options to read only the first 500 bytes, then using zlib.gunzip() to decompress the contents before parsing the header from the binary data.

This was working fine until node v5.0.0 patched a bug to ensure zlib throws an error on a truncated input (https://github.com/nodejs/node/pull/2595).

Now I'm getting the following error from zlib.

Error: unexpected end of file

How can I unzip this partial file, knowing that I'm truncating the input, without throwing an error? I was thinking it might be easier with streams, so I wrote the following.

var readStream = fs.createReadStream(file.path, {start: 0, end: 500});
var gunzip = zlib.createGunzip();

readStream.pipe(gunzip)
    .on('data', function(chunk) {
        console.log(parseBinaryHeader(chunk));
        console.log('got %d bytes of data', chunk.length);
    })
    .on('error', function (err) {
        console.log(err);
    })
    .on('end', function() {
        console.log('end');
    });

My parseBinaryHeader() function is returning the correct header content, so I know it's unzipping, but it still throws an error when it hits the end of the input. I can add the error listener and do nothing with the error, but this doesn't seem ideal.

Any ideas?

Constellates asked Nov 25 '15

1 Answer

Thanks for all the suggestions. I also opened an issue on the node repository and got some good feedback. Here's what ended up working for me.

  • Set the chunk size to the full header size.
  • Write the single chunk to the decompress stream and immediately pause the stream.
  • Handle the decompressed chunk.

Example:

var bytesRead = 500;
var decompressStream = zlib.createGunzip()
    .on('data', function (chunk) {
        parseHeader(chunk);
        decompressStream.pause();
    }).on('error', function (err) {
        handleGunzipError(err, file);
    });

// highWaterMark sizes the read chunks so the whole header arrives in a
// single chunk (fs.createReadStream has no chunkSize option)
fs.createReadStream(file.path, {start: 0, end: bytesRead, highWaterMark: bytesRead + 1})
    .on('data', function (chunk) {
        decompressStream.write(chunk);
    });

This has been working so far and also allows me to keep handling all other gunzip errors as the pause() prevents the decompress stream from throwing the "unexpected end of file" error.

Constellates answered Sep 20 '22