
How can content-encoding be ignored?

I have a device I need to download a file from. In certain cases, the file may be served with an incorrect content-encoding. In particular, it may have a content-encoding of "gzip" when it is not gzipped or compressed in any way.

When the file really is gzipped, it's simple to get the content using a basic ajax GET:

$.ajax({
    url: 'http://' + IP + '/test.txt',
    type: 'GET'
})
.done(function(data) {
    alert(data);
});

But this fails, as you might expect, when the content-encoding is wrong.

To be clear, I'm not looking for a way to bypass ERR_CONTENT_DECODING_FAILED when simply navigating to the given URL in a browser. I want to be able to load, for instance, a CSV into a string in JavaScript for further parsing.

Can I GET the file, and force it to skip attempting decoding, or override the content-encoding of the response, or some such?

femtoRgon asked Mar 31 '15 18:03



1 Answer

This is simply not possible to do via client-side JavaScript, per the WHATWG's XHR spec, which makes use of the fetch operation from the WHATWG Fetch Standard.

Client-side scripts can only read the response object supplied by the browser environment. The Fetch Standard defines how the browser environment must build a response object's body attribute in step 2 of the fetch operation (note especially substeps 2 through 4):

  1. Whenever one or more bytes are transmitted, let bytes be the transmitted bytes and run these subsubsteps:

    1. Increase response's body's transmitted with bytes' length.

    2. Let codings be the result of parsing Content-Encoding in response's header list.

    3. Set bytes to the result of handling content codings given codings and bytes.

    4. Push bytes to response's body.

Where the action handling content codings is:

To handle content codings given codings and bytes, run these substeps:

  1. If codings are not supported, return bytes.

  2. Return the result of decoding bytes with the given codings as explained in HTTP.
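As a rough illustration only, the two substeps above can be sketched in Node.js. This is not browser source code: the function name handleContentCodings, the DECODERS table, and the use of Node's zlib module are assumptions made for the example, and the codings are assumed to arrive already parsed into a lowercase array.

```javascript
const zlib = require("node:zlib");

// Hypothetical table of codings this sketch "supports".
const DECODERS = new Map([
  ["gzip", (bytes) => zlib.gunzipSync(bytes)],
  ["deflate", (bytes) => zlib.inflateSync(bytes)],
]);

// codings: array of lowercase coding names, in the order they were applied.
// bytes: Buffer holding the transmitted bytes.
function handleContentCodings(codings, bytes) {
  // Substep 1: if any coding is unsupported, return the bytes untouched.
  if (!codings.every((coding) => DECODERS.has(coding))) {
    return bytes;
  }
  // Substep 2: undo the codings, last applied first.
  return codings.reduceRight(
    (acc, coding) => DECODERS.get(coding)(acc),
    bytes
  );
}
```

With a correctly gzipped body this returns the original bytes. With a plain-text body labelled "gzip", the gzip decoder throws, which corresponds to the decoding failure the question describes. In neither case does the caller get to choose whether decoding is attempted.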

From this definition, we can see that a response object never exposes encoded bytes in its body property. Before bytes can be added to the body, they must first be decoded. A client script never has access to what the spec calls "transmitted bytes" (i.e., the actual encoded bytes sent over the wire).

Decoding is determined exclusively by the Content-Encoding header. There is no mechanism by which client-side JavaScript can manipulate the response headers of a response object, so Content-Encoding must be whatever the server originally sent.

What your server is doing is wrong. Your only options are:

  1. Fix the behavior of the server.

  2. Run the HTTP response through a proxy that fixes the Content-Encoding response header before it reaches your client.
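For the proxy option, a minimal sketch of the header-fixing step might look like the following. It assumes the proxy has buffered the upstream body without decoding it (Node's http module does not decompress responses on its own) and that header names are lowercase, as Node provides them. The names looksGzipped and fixContentEncoding are hypothetical.

```javascript
// Gzip streams begin with the magic bytes 0x1f 0x8b (RFC 1952).
function looksGzipped(body) {
  return body.length >= 2 && body[0] === 0x1f && body[1] === 0x8b;
}

// headers: plain object with lowercase header names.
// body: Buffer holding the raw, undecoded upstream response body.
// Returns a copy of the headers, without Content-Encoding when the
// header claims gzip but the body does not start like a gzip stream.
function fixContentEncoding(headers, body) {
  const fixed = { ...headers };
  const claimed = (fixed["content-encoding"] || "").trim().toLowerCase();
  if (claimed === "gzip" && !looksGzipped(body)) {
    delete fixed["content-encoding"];
  }
  return fixed;
}
```

The proxy would then send the untouched body to the client with the returned headers. The magic-byte check is a heuristic: an uncompressed file that happens to begin with 0x1f 0x8b would be misclassified, though that is unlikely for text such as CSV.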

apsillers answered Oct 22 '22 16:10