Uncompress gzip compressed http response

Tags: php

I'm using PHP's file_get_contents() function to make an HTTP request. To save bandwidth I decided to add the "Accept-Encoding: gzip" header using stream_context_create().

As expected, file_get_contents() now returns a gzip-encoded string, so I'm using gzuncompress() to decode it, but I get an error about the data passed as the argument.

[...] PHP Warning: gzuncompress(): data error in /path/to/phpscript.php on line 26

I know there is another function that can decompress gzipped data, gzdecode(), but it isn't included in my PHP version (maybe it is only available in SVN).

I know that cURL decodes a gzip stream on the fly (without any problem), but someone suggested that I use file_get_contents() instead of cURL.

Do you know any other way to decompress gzipped data in PHP, or why gzuncompress() emits a warning? It seems absurd that gzuncompress() doesn't work as expected.

Note: the problem is certainly on the PHP side. The HTTP request is made to the Tumblr API, which gives a well-encoded response.

Fabio Buda asked Jan 17 '12 13:01


2 Answers

This worked for me: http://www.php.net/manual/en/function.gzdecode.php#106397

Alternatively, try: http://digitalpbk.com/php/file_get_contents-garbled-gzip-encoding-website-scraping

if ( ! function_exists('gzdecode'))
{
    /**
     * Decode gz coded data
     * 
     * http://php.net/manual/en/function.gzdecode.php
     * 
     * Alternative: http://digitalpbk.com/php/file_get_contents-garbled-gzip-encoding-website-scraping
     * 
     * @param string $data gzencoded data
     * @return string inflated data
     */
    function gzdecode($data) 
    {
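        // Strip the fixed 10-byte gzip header and the 8-byte trailer
        // (CRC32 + uncompressed size), then inflate the raw deflate stream.
        // This assumes the header has no optional fields (file name,
        // comment, extra data), which would make it longer than 10 bytes.
        return gzinflate(substr($data, 10, -8));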
    }
}
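
The fallback above relies on the gzip header being exactly 10 bytes long. As a rough sketch of how the optional header fields defined in RFC 1952 could be skipped instead, here is a variant; the function name gzdecode_skip_optional_fields() is hypothetical (not a PHP built-in), and the sketch handles a single gzip member and does not verify the CRC32 in the trailer.

```php
<?php
/**
 * Hypothetical helper (illustrative name, not a PHP built-in):
 * inflate a single-member gzip string, skipping optional header fields.
 *
 * @param string $data gzip-encoded data
 * @return string|false inflated data, or false if the input looks malformed
 */
function gzdecode_skip_optional_fields($data)
{
    $len = strlen($data);
    // Minimum size: 10-byte header + 8-byte trailer; magic 1f 8b, method 8 (deflate).
    if ($len < 18 || substr($data, 0, 3) !== "\x1f\x8b\x08") {
        return false;
    }
    $flags = ord($data[3]);
    $pos = 10;

    if ($flags & 0x04) { // FEXTRA: 2-byte little-endian length, then that many bytes
        $field = unpack('vlen', substr($data, $pos, 2));
        $pos += 2 + $field['len'];
    }
    foreach (array(0x08, 0x10) as $bit) { // FNAME, FCOMMENT: zero-terminated strings
        if ($flags & $bit) {
            $end = ($pos < $len) ? strpos($data, "\0", $pos) : false;
            if ($end === false) {
                return false;
            }
            $pos = $end + 1;
        }
    }
    if ($flags & 0x02) { // FHCRC: 2-byte header checksum
        $pos += 2;
    }
    if ($pos >= $len - 8) {
        return false;
    }
    // What remains between the header and the 8-byte trailer is raw deflate data.
    return @gzinflate(substr($data, $pos, -8));
}

// Round trip with PHP's own encoder.
var_dump(gzdecode_skip_optional_fields(gzencode('hello')) === 'hello');
```

For responses whose header has no optional fields, this behaves like the shorter version; the extra parsing only matters when a server includes a file name, comment, or extra data.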
Mike answered Oct 13 '22 07:10


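gzuncompress() won't work for the gzip encoding. It decompresses zlib-format data (RFC 1950), as produced by gzcompress(), whereas a gzip HTTP response uses the gzip container (RFC 1952), which has a different header and trailer.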
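
To make the mismatch concrete, the following sketch compresses the same string with PHP's three zlib encoders and shows which decoder accepts which output. The sample string and variable names are arbitrary choices for illustration.

```php
<?php
// Sample payload; any string would do.
$text = str_repeat('hello world ', 20);

$gzip = gzencode($text);   // gzip container (RFC 1952), as used by "Content-Encoding: gzip"
$zlib = gzcompress($text); // zlib wrapper (RFC 1950), which gzuncompress() expects
$raw  = gzdeflate($text);  // raw deflate stream (RFC 1951), which gzinflate() expects

// Matching pairs round-trip cleanly.
$zlibOk = gzuncompress($zlib) === $text;
$rawOk  = gzinflate($raw) === $text;

// Mismatched pair: gzuncompress() rejects gzip data (warning suppressed with @).
$mismatch = @gzuncompress($gzip);

// The gzip magic bytes make the format easy to detect.
$isGzip = substr($gzip, 0, 2) === "\x1f\x8b";

var_dump($zlibOk, $rawOk, $mismatch, $isGzip);
```

Here $mismatch is false, which corresponds to the "data error" warning in the question.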

The manual's user notes list a few workarounds for the missing gzdecode() (see note #82930). Alternatively, use the implementation from upgradephp, or the gzopen() temp-file workaround.
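
A minimal sketch of the gzopen() temp-file idea, assuming the response body is already in a string: write it to a temporary file and let gzopen()/gzread() do the decoding. The function name gzdecode_via_tempfile() is hypothetical, and the 8192-byte chunk size is an arbitrary choice.

```php
<?php
/**
 * Hypothetical helper (illustrative name): decode gzip data by writing it
 * to a temporary file and reading it back through the zlib file functions.
 *
 * @param string $data gzip-encoded data
 * @return string|false decoded data, or false on failure
 */
function gzdecode_via_tempfile($data)
{
    $tmp = tempnam(sys_get_temp_dir(), 'gz');
    if ($tmp === false) {
        return false;
    }
    if (file_put_contents($tmp, $data) === false) {
        unlink($tmp);
        return false;
    }
    $gz = gzopen($tmp, 'rb');
    if ($gz === false) {
        unlink($tmp);
        return false;
    }
    $out = '';
    while (!gzeof($gz)) {
        $chunk = gzread($gz, 8192);
        // Stop on a read error or an empty read so a corrupt file cannot loop forever.
        if ($chunk === false || $chunk === '') {
            break;
        }
        $out .= $chunk;
    }
    gzclose($gz);
    unlink($tmp);
    return $out;
}

var_dump(gzdecode_via_tempfile(gzencode('hello')) === 'hello');
```

One caveat: gzopen() also reads files that are not in gzip format and passes their contents through unchanged, so this approach does not detect a response that was never compressed.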

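Another option would be to request the deflate encoding with the Accept-Encoding: header and then use gzinflate() for decompression. Note that servers differ on what they send for deflate: some send zlib-wrapped data, for which gzuncompress() is the matching function, and others send a raw deflate stream, which is what gzinflate() expects.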
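
As a sketch of how this could fit together with file_get_contents(), the code below picks a decoder from the response's Content-Encoding header. Both function names (decode_body(), fetch_decoded()) are hypothetical; the sketch assumes the http:// wrapper populates $http_response_header, and it tries zlib first and raw deflate second for the deflate case.

```php
<?php
/**
 * Hypothetical helper: decode a response body according to its Content-Encoding value.
 */
function decode_body($body, $encoding)
{
    $encoding = strtolower(trim($encoding));
    if ($encoding === 'gzip' || $encoding === 'x-gzip') {
        // Prefer the built-in gzdecode() when available, else strip the 10-byte header.
        return function_exists('gzdecode')
            ? @gzdecode($body)
            : @gzinflate(substr($body, 10, -8));
    }
    if ($encoding === 'deflate') {
        $out = @gzuncompress($body);                       // zlib-wrapped deflate
        return $out !== false ? $out : @gzinflate($body);  // fall back to raw deflate
    }
    return $body; // no (or unknown) encoding: return the body unchanged
}

/**
 * Hypothetical helper: fetch a URL with compression enabled and decode the result.
 */
function fetch_decoded($url)
{
    $context = stream_context_create(array(
        'http' => array('header' => "Accept-Encoding: gzip, deflate\r\n"),
    ));
    $body = @file_get_contents($url, false, $context);
    if ($body === false) {
        return false;
    }
    $encoding = '';
    // The http:// wrapper fills $http_response_header in the calling scope.
    if (isset($http_response_header)) {
        foreach ($http_response_header as $line) {
            if (stripos($line, 'Content-Encoding:') === 0) {
                $encoding = substr($line, strlen('Content-Encoding:'));
            }
        }
    }
    return decode_body($body, $encoding);
}
```

Calling fetch_decoded() with the API URL would then return the plain response body whether or not the server chose to compress it.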

mario answered Oct 13 '22 06:10