
Why isn't ETag alone enough to invalidate the browser cache?

Tags: http, caching, etag

I've read a lot of related articles on the matter, including the very good article on HTTP caching here: https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/http-caching?hl=en#invalidating-and-updating-cached-responses but one thing is still not clear to me:

Why isn't sending an ETag header enough to invalidate the browser cache for a particular resource? Why does everyone recommend actually changing the URL/filename of the resource to force the browser to re-download the file? If the browser has already cached the file with a particular ETag and the ETag is modified on the server, wouldn't that suffice?

AsGoodAsItGets asked May 21 '15 09:05



1 Answer

I find the following pages helpful:

  • https://jakearchibald.com/2016/caching-best-practices/
  • https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control
  • https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag

This line from MDN's ETag page makes the key point (note the words "and it is stale"):

If a user visits a given URL again (that has an ETag set), and it is stale, that is too old to be considered usable, the client will send the value of its ETag along in an If-None-Match header field...

The ETag will be used by the client to revalidate resources once they become "stale". But what constitutes "stale"?

This is where the Cache-Control header comes in. It is sent with a response to tell the client how long it may reuse a cached item before the item is considered stale. For example, Cache-Control: max-age=3600 keeps the item fresh for an hour, while Cache-Control: no-cache allows the item to be stored but requires revalidation with the server before every reuse, so it is effectively treated as stale immediately. See the MDN Cache-Control page for more information on available Cache-Control values.
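As a rough illustration of how "stale" can be derived from Cache-Control, here is a minimal sketch. The function names freshness_lifetime and is_stale are hypothetical, and the parsing handles only max-age, no-cache and no-store; real browsers also consider Expires, the Age header and heuristic freshness, which this sketch deliberately ignores.

```python
import re
from typing import Optional


def freshness_lifetime(cache_control: str) -> Optional[int]:
    """Seconds a response may be reused without revalidation.

    Returns 0 when every reuse must be revalidated (no-cache / no-store),
    and None when the header gives no explicit lifetime.
    """
    directives = [d.strip().lower() for d in cache_control.split(",") if d.strip()]
    if "no-cache" in directives or "no-store" in directives:
        return 0
    for directive in directives:
        match = re.fullmatch(r"max-age=(\d+)", directive)
        if match:
            return int(match.group(1))
    return None


def is_stale(age_seconds: int, cache_control: str) -> bool:
    """True when the cached entry should be revalidated before reuse."""
    lifetime = freshness_lifetime(cache_control)
    if lifetime is None:
        # Simplification (assumption): with no explicit lifetime, always revalidate.
        return True
    return age_seconds >= lifetime
```

With this sketch, an entry cached 10 seconds ago under max-age=60 is still fresh, so its ETag is never sent, which is exactly why a changed ETag on the server goes unnoticed until the lifetime runs out.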

When the browser processes a request for a cached resource that is considered stale, it first sends a revalidation request to the server, with the resource's last ETag value in the If-None-Match request header, as described on MDN's ETag page. If the cached response carried a Last-Modified header, the browser can also send that value in the If-Modified-Since request header as a secondary validator.
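The revalidation request can be sketched as a mapping from the cached response's validators to conditional request headers. The helper name revalidation_headers and the plain-dict representation of headers are assumptions made for illustration, not a browser API.

```python
from typing import Dict


def revalidation_headers(cached_response_headers: Dict[str, str]) -> Dict[str, str]:
    """Build the conditional request headers for revalidating a stale entry."""
    headers: Dict[str, str] = {}
    etag = cached_response_headers.get("ETag")
    last_modified = cached_response_headers.get("Last-Modified")
    if etag:
        # Primary validator: the ETag goes back as If-None-Match.
        headers["If-None-Match"] = etag
    if last_modified:
        # Secondary validator: Last-Modified goes back as If-Modified-Since.
        headers["If-Modified-Since"] = last_modified
    return headers
```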

If the server determines that the client's ETag value (in the If-None-Match request header) is still current, it responds with a 304 (Not Modified) status code and no body, indicating that the client can keep using the cached entry. Otherwise, it responds with a 200 status code and the new response body.
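The server side of that exchange might look like the sketch below. It is not any particular framework's API: make_etag and respond are hypothetical names, and deriving the ETag from a SHA-256 hash of the body is just one common choice, assumed here for illustration.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the body (assumption: content-hash based)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a possibly conditional GET."""
    etag = make_etag(body)
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        # Weak comparison: ignore a leading W/ on the client's tags.
        candidates = [tag[2:] if tag.startswith("W/") else tag for tag in candidates]
        if "*" in candidates or etag in candidates:
            # The client's copy is current: 304 with no body.
            return 304, {"ETag": etag}, b""
    # No validator sent, or it did not match: send the full response.
    return 200, {"ETag": etag}, body
```

Note that this code only runs if the browser actually sends a request; while the cached entry is still fresh, the server is never asked.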

Other resources:

  • Difference between no-cache and must-revalidate
  • What's default value of cache-control?

To answer your questions directly:

  • Why isn't sending an ETag header enough to invalidate the browser cache for a particular resource? -- Because the browser does not check the ETag with the server until the cached entry is considered stale, for example once the max-age set in the Cache-Control response header has elapsed. Until then it reuses its cached copy without contacting the server at all.
  • Why does everyone recommend actually changing the URL/filename of the resource to force the browser to re-download the file? -- The browser cache is keyed by URL, so a changed URL/filename or an added query string has no cached entry and must be fetched from the server. This is simple and is a virtually guaranteed way of cache-busting. This does not mean it's necessary, but it tends to be safe in the realm of inconsistent browser behaviors.
  • If the browser has already cached the file with a particular ETag and the ETag is modified on the server, wouldn't that suffice? -- Technically it should suffice as long as the response also carries caching headers that make the browser revalidate (an appropriate Cache-Control header, plus the legacy Pragma and Expires headers for older clients). See How to control web page caching, across all browsers? for more details.
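The URL-changing approach from the second answer is often implemented by embedding a content hash in the filename. The following is a minimal sketch of that idea; the function name fingerprinted_name and the 8-character SHA-256 prefix are assumptions, and build tools that offer this feature each have their own naming scheme.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprinted_name(filename: str, content: bytes) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    path = PurePosixPath(filename)
    # e.g. static/app.js becomes static/app.<digest>.js
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))
```

Because the name changes whenever the content changes, such files can be served with a long max-age, while the page that references them is served with a short lifetime or no-cache so that it picks up the new names.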
Mike Hill answered Oct 30 '22 05:10