Leveraging etags and chunked encoding at the same time?

updated question

How can my application leverage etags, and does introducing streaming/chunked encoding introduce any complications?


original question

When doing HTTP streaming with Transfer-Encoding: chunked, Content-Length can't be sent because it often is not known.

To my understanding, when browsers leverage etags they require knowing Content-Length. If an etag is provided but not Content-Length, browsers will never send If-None-Match.

Is there a way around this?

John Bachir asked Nov 14 '13 07:11



1 Answer

What are entity tags?

ETags are HTTP response headers used to version pages. They allow the client to reuse a previously cached copy of a page if the page has not changed.

The basic idea is that the client goes to a page and sends an HTTP request to the server that has the page. The server then renders the page and returns the response to the client along with an ETag header that holds some value. In addition to showing the page, the client will file a copy of that page in its local cache along with the ETag. The next time the client visits that page, it will issue a request to the webserver but include the ETag value in an If-None-Match header. Such a request is known as a conditional GET. The client is saying, "I would like this page, however I already have a cached version of the page with this ETag value, so if you think that my cached version is current, just tell me that, and I'll just show my cached copy to the user".

There aren't any semantic requirements for the ETag value. It should be used to store a value that allows you to determine whether the client's copy is up to date.

The simplest way to do this is to calculate a hash of your response. If the hash matches the ETag value in the If-None-Match request header, then the client already holds an identical copy and you can return a 304 Not Modified with an empty body. This is much faster than returning the entire page again.
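As a minimal sketch of the hash approach (the function names `make_etag` and `respond` are made up for illustration, and the comparison assumes the client sends back exactly one ETag, which ignores lists and weak `W/` validators):

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Return an ETag value: the quoted hex digest of the response body."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond(request_headers: dict, body: bytes):
    """Return (status, headers, body) for a GET, honouring If-None-Match.

    Simplification: If-None-Match is compared as a single exact value.
    """
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is identical: send headers only.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Content-Length": str(len(body))}, body
```

Note that the server still has to render the full body to compute the hash. The saving here is bandwidth, not rendering time.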

Optimization

While calculating a hash is a simple and safe way to determine if the cache is still good, more intelligent techniques exist that will allow you to reduce the load on your webserver. Consider a page that displays a product in a webshop. Instead of rendering the page with the product description and then computing and comparing the hash, you could derive the ETag from the product's updated_at attribute. This means that the first thing you do in your application is fetch the product from the database and compare its updated_at attribute with the ETag sent by the client. If they match, you assume the product's details have not changed, so you can skip rendering entirely and return the 304 Not Modified response.
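A sketch of this shortcut, under the assumption that the product is a plain dict with `id` and `updated_at` keys and that `render` is whatever function builds the page (all of these names are hypothetical):

```python
from datetime import datetime, timezone


def product_etag(product_id: int, updated_at: datetime) -> str:
    """Build an ETag from cheap version data instead of the rendered page."""
    return f'"product-{product_id}-{updated_at.isoformat()}"'


def handle_product_request(request_headers: dict, product: dict, render):
    """Return (status, headers, body); render() runs only on a cache miss."""
    etag = product_etag(product["id"], product["updated_at"])
    if request_headers.get("If-None-Match") == etag:
        # Same updated_at as the client's copy: skip rendering altogether.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, render(product)
```

The database lookup still happens on every request, but the expensive part (rendering) is skipped whenever the client's copy is current.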

However, you should be careful with this kind of optimization, as there may be additional content on the page that can become outdated without affecting the updated_at attribute of the product in your database. This could be a sidebar with the latest news or, worse, a personalized part of the page such as a shopping cart listing previously added products.
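One possible way to guard against this, sketched here with a hypothetical helper `page_etag`, is to fold a version string for every part of the page into the ETag, so that a change to any of them invalidates the cached copy:

```python
import hashlib


def page_etag(*version_parts: str) -> str:
    """Combine the version of every page component into one ETag.

    The parts are joined with a NUL separator so that ("ab", "c") and
    ("a", "bc") do not collapse into the same input.
    """
    joined = "\x00".join(version_parts)
    return '"' + hashlib.sha256(joined.encode("utf-8")).hexdigest() + '"'
```

For the webshop page this might be called as `page_etag(product_updated_at, latest_news_id, cart_version)`, where the three arguments are assumed to be cheap-to-fetch strings.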

Chunked Encoding

Chunked encoding is merely a technique to transfer a response in multiple chunks, so the receiving client can start rendering the page while the server is still working on the remaining chunks. It does not have anything to do with caching. However, if you want to use the hashed value of the response as the ETag, that is not possible, because the headers are sent before you know the full response, which is required to calculate the hash.
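To illustrate, here is a rough sketch of a streamed response whose ETag comes from something known up front (for example the updated_at approach above) rather than from a hash of the body. The function names are invented for this example and the raw HTTP/1.1 framing is written by hand only to show the order in which things go on the wire:

```python
from typing import Iterable, Iterator


def encode_chunked(parts: Iterable[bytes]) -> Iterator[bytes]:
    """Frame each part as an HTTP/1.1 chunk: hex size, CRLF, data, CRLF."""
    for part in parts:
        if part:  # a zero-length chunk would end the body prematurely
            yield f"{len(part):X}\r\n".encode("ascii") + part + b"\r\n"
    yield b"0\r\n\r\n"  # terminating chunk, no trailers


def stream_response(request_headers: dict, etag: str,
                    parts: Iterable[bytes]) -> Iterator[bytes]:
    """Yield the raw response; the ETag must be known before any body bytes."""
    if request_headers.get("If-None-Match") == etag:
        yield f"HTTP/1.1 304 Not Modified\r\nETag: {etag}\r\n\r\n".encode("ascii")
        return
    head = f"HTTP/1.1 200 OK\r\nETag: {etag}\r\nTransfer-Encoding: chunked\r\n\r\n"
    yield head.encode("ascii")
    yield from encode_chunked(parts)
```

The ETag header goes out before the first chunk and no Content-Length is sent, which is why the ETag has to be derived from data available at the start of the request.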

Niels B. answered Sep 21 '22 06:09