Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Apache HttpClient response content length returns -1

Why does the following Code returns -1? Seems that the request failed.

public static void main(String[] args)
{
    DefaultHttpClient httpClient = new DefaultHttpClient();
    HttpGet httpGet = new HttpGet("http://www.google.de");

    HttpResponse response;
    try
    {
        response = httpClient.execute(httpGet);
        HttpEntity entity = response.getEntity();
        EntityUtils.consume(entity);

        // Prints -1
        System.out.println(entity.getContentLength());
    }
    catch (ClientProtocolException e)
    {
        e.printStackTrace();
    }
    catch (IOException e)
    {
        e.printStackTrace();
    }
    finally
    {
        httpGet.releaseConnection();
    }
}

And is it possible to get the response as String?

like image 664
PeterMoron Avatar asked Sep 10 '13 19:09

PeterMoron


3 Answers

Try running

Header[] headers = response.getAllHeaders();
for (Header header : headers) {
    System.out.println(header);
}

It will print

Date: Tue, 10 Sep 2013 19:10:04 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=dad7e2356ddb3b7a:FF=0:TM=1378840204:LM=1378840204:S=vQcLzVPbOOTxfvL4; expires=Thu, 10-Sep-2015 19:10:04 GMT; path=/; domain=.google.de
Set-Cookie: NID=67=S11HcqAV454IGRGMRo-AJpxAPxClJeRs4DRkAJQ5vI3YBh4anN3qS0EVeiYX_4XDTGN-mY86xTBoJ3Ncca7eNSdtGjcaG31pbCOuqsZEQMWwKn-7-6Dnizx395snehdA; expires=Wed, 12-Mar-2014 19:10:04 GMT; path=/; domain=.google.de; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Alternate-Protocol: 80:quic
Transfer-Encoding: chunked

This is not a problem, the page you requested simply doesn't provide a Content-Length header in its response. As such, the HttpEntity#getContentLength() returns -1.

EntityUtils has a number of methods, some of which return a String.


Running curl more recently produces

> curl --head http://www.google.de
HTTP/1.1 200 OK
Date: Fri, 03 Apr 2020 15:38:18 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Server: gws
X-XSS-Protection: 0
X-Frame-Options: SAMEORIGIN
Set-Cookie: 1P_JAR=2020-04-03-15; expires=Sun, 03-May-2020 15:38:18 GMT; path=/; domain=.google.de; Secure
Set-Cookie: NID=201=H8GdKY8_vE5Ehy6qSkmQru13HqdGEj2tvZUFqvTDAVBxFoL4POI0swPtfI45v1TBjrJuAAfbcNMUddniIf9HHituCAFwUqmUFMDwxDYK5qUlcWiB1A64OcGp6PTT6LKur2r_3z-ToSvLf8RZhKWdny6E8SaArMpkaOqUEWp4aoQ; expires=Sat, 03-Oct-2020 15:38:18 GMT; path=/; domain=.google.de; HttpOnly
Transfer-Encoding: chunked
Accept-Ranges: none
Vary: Accept-Encoding

The headers contain a Transfer-Encoding value of chunked. With chunked, the response contains "chunks" preceded by their length. An HTTP client uses those to read the entire response.

The HTTP Specification states that the Content-Length header should not be present when Transfer-Encoding has a value of chunked and MUST be ignored if it is.

like image 69
Sotirios Delimanolis Avatar answered Sep 20 '22 11:09

Sotirios Delimanolis


Please notice that response header name Transfer-Encoding. Its value is chunked which means data is deliveryed block by block. Transfer-Encoding: chunked and Content-Length does not turn out at the same time. There are two reason.

  1. Server does not want sent content length.
  2. Or server do not know the content length when it flush a big size data whose size is large than server's buffer.

So when there is no content length header, you can find the size of each chunked block before body of content. For example:

HTTP/1.1 200 OK

Server: Apache-Coyote/1.1

Set-Cookie: JSESSIONID=8A7461DDA53B4C4DD0E89D73219CB5F8; Path=/

Content-Type: text/html;charset=UTF-8

Transfer-Encoding: chunked

Date: Wed, 18 Mar 2015 07:10:05 GMT

11

helloworld!

3

123

0

Above headers and content tell us, there are two block data. The size of first block is 11. the size of second block is 3. So the content length is 14 at all.

regards, Xici

like image 22
xici Avatar answered Sep 18 '22 11:09

xici


If you really want to get the content length without caring about the content, you can do this.

EntityUtils.toByteArray(httpResponse.getEntity()).length

like image 23
gaojun1000 Avatar answered Sep 21 '22 11:09

gaojun1000