Why can PHP's gzuncompress() function go wrong?

Tags: php, gzip, unzip

PHP has its own function to work with gzip archives. I wrote the following code:

error_reporting(E_ALL);
$f = file_get_contents('http://spiderbites.nytimes.com/sitemaps/www.nytimes.com/sitemap.xml.gz');
echo $f;
$f = gzuncompress($f);
echo "<hr>";
echo $f;

The first echo outputs the compressed file with a proper header (at least the first two bytes are correct). If I download this file with my browser, I can unzip it easily.
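
For anyone who wants to check those first two bytes in code, here is a rough sketch. The helper name sniff_compression_format() is made up for illustration and is not part of PHP; it only looks at the header, so it is a hint, not a guarantee that the rest of the data is valid.

```php
<?php
// Hypothetical helper (not a built-in): guess the container format
// from the first two bytes of the data.
function sniff_compression_format(string $data): string
{
    if (strlen($data) < 2) {
        return 'unknown';
    }
    $b0 = ord($data[0]);
    $b1 = ord($data[1]);

    // RFC 1952: gzip data starts with the magic bytes 0x1f 0x8b
    if ($b0 === 0x1f && $b1 === 0x8b) {
        return 'gzip';
    }
    // RFC 1950: low nibble of byte 0 is 8 (deflate) and the two header
    // bytes, read as a big-endian number, are a multiple of 31
    if (($b0 & 0x0f) === 8 && (($b0 << 8) | $b1) % 31 === 0) {
        return 'zlib';
    }
    return 'unknown';
}

$sample = 'hello hello hello';
echo sniff_compression_format(gzencode($sample)), "\n";   // gzip
echo sniff_compression_format(gzcompress($sample)), "\n"; // zlib
```

A .gz file downloaded from a server should report gzip here, which is the format gzdecode() reads and gzuncompress() does not.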

However, gzuncompress() throws a warning: Warning: gzuncompress(): data error in /home/path/to/script.php on line 5

Can anyone point me in the right direction to solve this problem?

EDIT:

Part of the phpinfo() output:

[image not preserved]

Vlada Katlinskaya asked Dec 09 '22 05:12


2 Answers

Or you could just use the right decompression function, gzdecode().
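
A minimal sketch of the difference, assuming PHP 5.4 or later (where gzdecode() is available). To keep it runnable without network access, the downloaded file is replaced by a stand-in produced with gzencode(); the sample XML string is invented for the example.

```php
<?php
// Stand-in for the body of the downloaded sitemap.xml.gz (assumption:
// the real file is ordinary gzip data, like gzencode() output).
$xml = '<urlset><url><loc>http://example.com/</loc></url></urlset>';
$f = gzencode($xml);

// gzuncompress() expects zlib-format data, so gzip input fails.
// The @ hides the "data error" warning for this demo.
$bad = @gzuncompress($f);

// gzdecode() reads the gzip format and returns the original string.
$good = gzdecode($f);

var_dump($bad);   // bool(false)
echo $good, "\n";
```

In the code from the question, that means replacing the gzuncompress($f) call with gzdecode($f) and leaving the rest as it is.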

mario answered Dec 11 '22 11:12


Note that gzuncompress() may fail to decompress some compressed strings and return a data error.

The problem could be that the externally compressed string has a CRC32 checksum at the end of the file instead of the Adler-32 checksum that gzuncompress() expects.

(http://php.net/manual/en/function.gzuncompress.php#79042)

That could be one reason why it does not work.
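
To make the checksum difference concrete, here is a small sketch that compares the trailing bytes of the two formats. It assumes the layouts described in RFC 1952 (gzip: CRC-32 followed by the uncompressed length, both little-endian) and RFC 1950 (zlib: Adler-32, big-endian); the sample string is arbitrary.

```php
<?php
$data = 'The quick brown fox jumps over the lazy dog';

$gzip = gzencode($data);   // gzip container, what a .gz file holds
$zlib = gzcompress($data); // zlib container, what gzuncompress() reads

// gzip trailer: 4 bytes CRC-32, then 4 bytes uncompressed length
$gzipCrc  = substr($gzip, -8, 4);
$gzipSize = substr($gzip, -4);

// zlib trailer: 4 bytes Adler-32
$zlibAdler = substr($zlib, -4);

// Compare each trailer field with a checksum computed directly
$crcMatches   = $gzipCrc === pack('V', crc32($data));
$sizeMatches  = $gzipSize === pack('V', strlen($data));
$adlerMatches = $zlibAdler === hash('adler32', $data, true);

var_dump($crcMatches, $sizeMatches, $adlerMatches);
```

The headers differ as well, not only the trailers, so a decoder written for one format rejects the other before it ever reaches the checksum.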

Try the code from that comment:

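function gzuncompress_crc32($data) {
     // Write the data to a temporary file behind a minimal gzip header
     $f = tempnam(sys_get_temp_dir(), 'gz_fix');
     file_put_contents($f, "\x1f\x8b\x08\x00\x00\x00\x00\x00" . $data);
     // Let the compress.zlib:// stream wrapper do the decompression
     $result = file_get_contents('compress.zlib://' . $f);
     unlink($f); // remove the temporary file
     return $result;
}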

Modify your code like this:

error_reporting(E_ALL);
$f = file_get_contents('http://spiderbites.nytimes.com/sitemaps/www.nytimes.com/sitemap.xml.gz');
echo $f;
$f = gzuncompress_crc32($f);
echo "<hr>";
echo $f;

As far as I have tested locally, it does not give the error anymore.

GiamPy answered Dec 11 '22 11:12