Uncompress GZIPed HTTP Response in Java

Question

I'm trying to decompress a gzipped HTTP response using GZIPInputStream. However, I always get the same exception when I try to read the stream: java.util.zip.ZipException: invalid bit length repeat

My HTTP request headers:

GET www.myurl.com HTTP/1.0
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; fr; rv:1.9.2) Gecko/20100115 Firefox/3.6
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,UTF-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
X-Requested-With: XMLHttpRequest
Cookie: Some Cookies

At the end of the HTTP response headers, I get path=/Content-Encoding: gzip, followed by the gzipped response.

I tried two similar pieces of code to decompress it:

UPDATE: In the following code, tBytes = (the string after 'path=/Content-Encoding: gzip').getBytes ();

GZIPInputStream  gzip = new GZIPInputStream (new ByteArrayInputStream (tBytes));
StringBuffer  szBuffer = new StringBuffer ();
byte  tByte [] = new byte [1024];

while (true)
{
    int  iLength = gzip.read (tByte, 0, 1024); // <-- Error comes here

    if (iLength < 0)
        break;

    szBuffer.append (new String (tByte, 0, iLength));
}

And this one, which I found on this forum:

InputStream     gzipStream = new GZIPInputStream   (new ByteArrayInputStream (tBytes));
Reader          decoder    = new InputStreamReader (gzipStream, "UTF-8"); // <- I tried ISO-8859-1 and got the same exception
BufferedReader  buffered   = new BufferedReader    (decoder);

I guess this is an encoding error.

Best regards,

bill0ute

Wim Coenen · Accepted Answer

You don't show how you get the tBytes that you use to set up the gzip stream here:

GZIPInputStream  gzip = new GZIPInputStream (new ByteArrayInputStream (tBytes));

One explanation is that you are including the entire HTTP response in tBytes. Instead, it should be only the content after the HTTP headers.

Another explanation is that the response is chunked.
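If the response does use chunked transfer coding, the body has to be reassembled before it is handed to GZIPInputStream. The following is only a minimal sketch of that step, assuming the bytes after the headers are already held in a byte array; the class and method names (ChunkedBodyDecoder, dechunk, indexOfCrlf) are made up for illustration, and chunk extensions and trailers are simply ignored.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
class ChunkedBodyDecoder {

    /** Joins the data of all chunks in a chunked message body into one array. */
    static byte[] dechunk(byte[] body) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 0;
        while (true) {
            // Each chunk starts with a line holding its size in hexadecimal.
            int lineEnd = indexOfCrlf(body, pos);
            if (lineEnd < 0) {
                throw new IOException("missing chunk-size line");
            }
            String sizeLine = new String(body, pos, lineEnd - pos, StandardCharsets.US_ASCII);
            int semicolon = sizeLine.indexOf(';');
            if (semicolon >= 0) {
                sizeLine = sizeLine.substring(0, semicolon); // drop chunk extensions
            }
            int size;
            try {
                size = Integer.parseInt(sizeLine.trim(), 16);
            } catch (NumberFormatException e) {
                throw new IOException("bad chunk size: " + sizeLine, e);
            }
            pos = lineEnd + 2; // first byte after the size line
            if (size == 0) {
                break; // a zero-sized chunk ends the body
            }
            if (size < 0 || pos + size > body.length) {
                throw new IOException("truncated chunk");
            }
            out.write(body, pos, size);
            pos += size + 2; // skip the chunk data and the CRLF that follows it
        }
        return out.toByteArray();
    }

    /** Returns the index of the next CRLF at or after 'from', or -1 if there is none. */
    static int indexOfCrlf(byte[] data, int from) {
        for (int i = from; i + 1 < data.length; i++) {
            if (data[i] == '\r' && data[i + 1] == '\n') {
                return i;
            }
        }
        return -1;
    }
}
```

The result of dechunk(...) would then be the input for new GZIPInputStream(new ByteArrayInputStream(...)).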

Edit: You are taking the data after the Content-Encoding line as the message body. However, according to the HTTP/1.1 specification the header fields do not come in any particular order, so this is very dangerous.

As explained in this part of the HTTP specification, the message body of a request or response doesn't come after a particular header field but after the first empty line:

Request (section 5) and Response (section 6) messages use the generic message format of RFC 822 [9] for transferring entities (the payload of the message). Both types of message consist of a start-line, zero or more header fields (also known as "headers"), an empty line (i.e., a line with nothing preceding the CRLF) indicating the end of the header fields, and possibly a message-body.

You still haven't shown exactly how you compose tBytes, but at this point I think you're erroneously including the empty line in the data that you try to decompress. The message body starts after the CRLF characters of the empty line.
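To make the "body starts after the empty line" rule concrete, here is a rough sketch that locates the first CRLF CRLF in the raw response and decompresses everything after it. It assumes the whole response is available as a byte array, that the body is not chunked, and that the decompressed text is UTF-8; the names GzipBodyExtractor, bodyOffset and decodeBody are invented for this example. It deliberately works on bytes rather than on a String, because converting compressed data to a String and back with getBytes() can alter it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of any library.
class GzipBodyExtractor {

    /** Returns the index of the first body byte (just after the first CRLF CRLF), or -1. */
    static int bodyOffset(byte[] raw) {
        for (int i = 0; i + 3 < raw.length; i++) {
            if (raw[i] == '\r' && raw[i + 1] == '\n'
                    && raw[i + 2] == '\r' && raw[i + 3] == '\n') {
                return i + 4;
            }
        }
        return -1;
    }

    /** Decompresses the gzipped body of a raw HTTP response and decodes it as UTF-8 (assumed). */
    static String decodeBody(byte[] raw) throws IOException {
        int offset = bodyOffset(raw);
        if (offset < 0) {
            throw new IOException("no empty line found after the headers");
        }
        try (GZIPInputStream gzip = new GZIPInputStream(
                     new ByteArrayInputStream(raw, offset, raw.length - offset));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[1024];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            // Decode once, after all bytes have been collected.
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```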

May I suggest that you use the httpclient library instead to extract the message body?

Thusitha Nuwan · Answer

Well, there is a problem I can see here:

int  iLength = gzip.read (tByte, 0, 1024);

Use the following to fix it:

byte[] buff = new byte[1024];
byte[] emptyBuff = new byte[1024];
StringBuffer unGzipRes = new StringBuffer();

int byteCount = 0;
while ((byteCount = gzip.read(buff, 0, 1024)) > 0) {
    // only append the buff elements that
    // contains data
    unGzipRes.append(new String(Arrays.copyOf(
            buff, byteCount), "utf-8"));

    // empty the buff for re-usability and
    // prevent dirty data attached at the
    // end of the buff
    System.arraycopy(emptyBuff, 0, buff, 0,
            1024);
}
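One caveat with decoding each 1024-byte block on its own is that a multi-byte UTF-8 character may be split across two reads. A possible variant, sketched here under the assumption that the content is UTF-8, lets an InputStreamReader do the decoding so that such boundaries are handled by the reader. The class and method names (GzipTextReader, gunzipToString) are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of any library.
class GzipTextReader {

    /** Reads a gzip-compressed stream fully and returns its content as text (UTF-8 assumed). */
    static String gunzipToString(InputStream compressed) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new GZIPInputStream(compressed), StandardCharsets.UTF_8)) {
            char[] buffer = new char[1024];
            int n;
            // read() returns the number of chars read, or -1 at end of stream
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n); // append only the chars actually read
            }
        }
        return text.toString();
    }
}
```

Because only the first n characters of the buffer are appended on each pass, the buffer does not need to be cleared between reads.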

Tags: java, compression, gzip, httpresponse
