
XMLHttpRequest and Chrome developer tools don't say the same thing

I'm downloading a ~50 MB file in 5 MB chunks using XMLHttpRequest and the Range header. Things work great, except for detecting when I've downloaded the last chunk.

Here's a screenshot of the request and response for the first chunk. Notice the Content-Length is 1024 * 1024 * 5 (5 MB). Also notice that the server responds correctly with the first 5 MB, and in the Content-Range header, properly specifies the size of the entire file (after the /):

[screenshot: first chunk]

When I copy the response body into a text editor (Sublime), I only get 5,242,736 characters instead of the expected 5,242,880 as indicated by Content-Length:

[screenshot: actual length]

Why are 144 characters missing? This is true of every chunk that gets downloaded, though the exact difference varies a little bit.

However, what's especially strange is the last chunk. The server responds with the last ~2.9 MB of the file (instead of a whole 5 MB) and apparently properly indicates this in the response:

[screenshot: last chunk]

Notice that I am requesting the next 5 MB (even though it goes beyond the total file size). No biggie, the server responds with the last part of the file and the headers indicate the actual byte range returned.

But does it really?

When I call xhr.getResponseHeader("Content-Length") in JavaScript, I see a different story in Chrome:

[screenshot: Chrome dev tools don't agree]

The XMLHttpRequest object is telling me that another 5 MB was downloaded, beyond the end of the file. Is there something I don't understand about the xhr object?

What's even weirder is that it works in Firefox 30 as expected:

[screenshot: Firefox XHR works]

So, between xhr.responseText.length not matching the Content-Length and the headers disagreeing between the xhr object and the Network panel, I don't know how to fix this.

What's causing these discrepancies?

Update: I have confirmed that the server itself is sending the response properly, despite the overshot Range header in the request for the last chunk. These are the raw response headers, courtesy of good ol' telnet:

HTTP/1.1 206 Partial Content
Server: nginx/1.4.5
Date: Mon, 14 Jul 2014 21:50:06 GMT
Content-Type: application/octet-stream
Content-Length: 2987360
Last-Modified: Sun, 13 Jul 2014 22:05:10 GMT
Connection: keep-alive
ETag: "53c30296-2fd9560"
Content-Range: bytes 47185920-50173279/50173280
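
For reference, the request that produced these headers would look roughly like the following; the path and host are placeholders (the real ones aren't shown in the question), and the end of the range is the start offset plus 5 MB minus one:

GET /path/to/file HTTP/1.1
Host: example.com
Range: bytes=47185920-52428799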

So it looks like Chrome is malfunctioning. Should this be filed as a bug? Where?

Matt asked Jul 14 '14 at 20:07

1 Answer

The main issue is that you are reading binary data as text. Note that the server responds with Content-Type: application/octet-stream which doesn't specify the encoding explicitly - in that case the browser will typically assume that the data is encoded in UTF-8. While the length will mostly be unchanged (bytes with values 0 to 127 are interpreted as a single character in UTF-8 and bytes with higher values will usually be replaced by the replacement character �), your binary file will certainly contain a few valid multi-byte UTF-8 sequences - and these will be combined into one character. That explains why responseText.length doesn't match the number of bytes received from the server.
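
As a quick illustration of that effect (this uses the TextDecoder API, which is not part of the answer, purely to show the byte-to-character collapse):

// Three bytes that form one valid UTF-8 sequence decode to a single character,
// so the decoded string is shorter than the byte count.
var bytes = new Uint8Array([0xE2, 0x82, 0xAC]);                   // UTF-8 encoding of "€"
console.log(bytes.length);                                        // 3 bytes
console.log(new TextDecoder("utf-8").decode(bytes).length);       // 1 character
console.log(new TextDecoder("iso-8859-1").decode(bytes).length);  // 3 characters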

Now you could of course force a specific encoding using the request.overrideMimeType() method; ISO 8859-1 would make sense in particular because the first 256 Unicode code points are identical to ISO 8859-1:

request.overrideMimeType("application/octet-stream; charset=iso-8859-1");
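
For context, a minimal sketch of that override in place (the URL and range here are made up, and note that overrideMimeType() has to be called before send()):

var xhr = new XMLHttpRequest();
xhr.open("GET", "/bigfile.bin");                      // placeholder URL
xhr.setRequestHeader("Range", "bytes=0-5242879");     // first 5 MB chunk
xhr.overrideMimeType("application/octet-stream; charset=iso-8859-1");
xhr.onload = function() {
  // With a single-byte charset every byte becomes exactly one character,
  // so this length should equal the number of bytes actually received.
  console.log(xhr.responseText.length);
};
xhr.send();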

That should make sure that one byte is always interpreted as one character. Still, a better approach would be to store the server response in an ArrayBuffer, which is explicitly meant for binary data.

var request = new XMLHttpRequest();
request.open(...);                      // method and URL omitted here
request.responseType = "arraybuffer";   // ask for raw bytes instead of text

request.onload = function() {
  // request.response is an ArrayBuffer; wrap it to read individual bytes
  var array = new Uint8Array(request.response);
  alert("First byte has value " + array[0]);
  alert("Array length is " + array.length);
};

request.send();

According to MDN, responseType = "arraybuffer" is supported starting with Chrome 10, Firefox 6 and Internet Explorer 10. See also: Typed arrays.
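
Tying that back to the chunked download in the question, here is a sketch of how it could look with responseType = "arraybuffer". The URL and chunk size are assumptions; the total size is read from the first Content-Range, and counting the bytes actually received avoids trusting a possibly rewritten Content-Length:

var url = "/bigfile.bin";            // placeholder URL
var CHUNK = 5 * 1024 * 1024;         // 5 MB per request
var parts = [];
var received = 0;
var totalSize = null;

function downloadChunk(start) {
  var xhr = new XMLHttpRequest();
  xhr.open("GET", url);
  xhr.responseType = "arraybuffer";
  xhr.setRequestHeader("Range", "bytes=" + start + "-" + (start + CHUNK - 1));
  xhr.onload = function() {
    var bytes = new Uint8Array(xhr.response);
    parts.push(bytes);
    received += bytes.length;        // actual byte count, no headers involved
    if (totalSize === null) {
      // "bytes 0-5242879/50173280" -> the total size is the number after "/"
      var range = xhr.getResponseHeader("Content-Range");
      totalSize = parseInt(range.split("/")[1], 10);
    }
    if (received < totalSize) {
      downloadChunk(start + CHUNK);  // fetch the next chunk
    } else {
      console.log("Done: " + received + " bytes in " + parts.length + " chunks");
    }
  };
  xhr.send();
}

downloadChunk(0);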

Side-note: Firefox also supports responseType = "moz-chunked-text" and responseType = "moz-chunked-arraybuffer" starting with Firefox 9, which allow receiving data in chunks without resorting to range requests. It seems that Chrome doesn't plan to implement these; instead, they are working on the Streams API.
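
For what it's worth, a minimal sketch of reading a response progressively with the Streams API as it exists in browsers that now ship it; the URL is a placeholder:

fetch("/bigfile.bin").then(function(response) {
  var reader = response.body.getReader();
  var received = 0;
  function pump() {
    return reader.read().then(function(result) {
      if (result.done) {
        console.log("Finished, " + received + " bytes total");
        return;
      }
      received += result.value.length;   // result.value is a Uint8Array
      return pump();
    });
  }
  return pump();
});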

Edit: I was unable to reproduce your issue with Chrome lying to you about the response headers, at least not without your code. However, the code responsible should be this function in partial_data.cc:

// We are making multiple requests to complete the range requested by the user.
// Just assume that everything is fine and say that we are returning what was
// requested.
void PartialData::FixResponseHeaders(HttpResponseHeaders* headers,
                                     bool success) {
  if (truncated_)
    return;

  if (byte_range_.IsValid() && success) {
    headers->UpdateWithNewRange(byte_range_, resource_size_, !sparse_entry_);
    return;
  }

This code will remove the Content-Length and Content-Range headers returned by the server and replace them with ones generated from your request parameters. Given that I cannot reproduce the issue myself, the following are only guesses:

  • This code path seems to be used only for requests that can be satisfied from the cache, so I guess that things will work correctly if you clear your cache.
  • The resource_size_ variable must have a wrong value in your case, larger than the actual size of the requested file. This variable is determined from the Content-Range header of the first chunk requested; maybe you have a server response cached there that indicates a larger file.

Wladimir Palant answered Oct 07 '22 at 10:10