
Why would ETags be set to a MUST requirement if you already have the resource?

Why would you set ETags to a "MUST requirement level"?

You obtain the resource before the ETag is returned...

I'm working on a project where I am the client sending HTTP requests to a server. The server returns a Cache-Control header along with an ETag so the response can be cached (on each additional request, the ETag is sent back in the If-None-Match header so the server can determine whether the cached data is stale and a new response should be sent). In my current project the ETag is used with the conditional GET architecture at the MUST requirement level as specified in RFC 2119.

MUST This word, or the terms "REQUIRED" or "SHALL", mean that the definition is an absolute requirement of the specification.

I don't understand the intent of using a conditional GET with the MUST requirement level. From my understanding, the MUST requirement is there to limit (is that right?) the resources provided to the client that makes the request. However, the client (me in this case) already has the resources from the first request, and I can continue obtaining the same resource (or a fresher one if it gets updated) as much as I want, with or without sending the If-None-Match header and receiving the ETag header.

What would be the purpose of setting it to the MUST requirement level in this case if it's not limiting the resources returned, aside from being able to cache and limiting the amount of requests to the server? (I'm asking from the client's point of view: yes, I know I can cache it, but why the MUST requirement?) Isn't this only used for limiting resources?

So basically, doesn't it make this MUST requirement not a requirement if I can obtain the resources with or without it? Am I missing something here?

My question is not asking what the ETag, Cache-Control, or If-None-Match headers are or how they work.

Thanks in advance, cheers!

garrettmac asked Oct 29 '15 20:10



2 Answers

Why would ETags be set to a MUST requirement if you already have the resource?

A client MUST use a conditional GET to reduce the data traffic.

Aside from being able to cache and limiting the amount of requests to the server

The number of requests stays the same, but the total amount of data transferred decreases.


Using ETags in If-None-Match GET requests (conditional GET)

  1. When you make an API call, the response headers include an ETag whose value identifies the returned data (typically a hash of it). You store this ETag value for use in the next request.
  2. The next time you make the same API call, you include the If-None-Match request header with the ETag value stored in the first step.
    • If the data has not changed, the response status code will be 304 Not Modified and no body is returned.
    • If the data has changed since the last query, the data is returned as usual with a new ETag. The game starts again: you store the new ETag value and use it for subsequent requests.
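The steps above can be sketched in Python as follows. This is a minimal in-memory simulation, not a real HTTP client: `origin_get`, `make_etag`, `RESOURCE`, and `ConditionalClient` are hypothetical names invented for illustration, and the exact-string ETag comparison is a simplification (a real server also accepts a list of tags and uses weak comparison).

```python
import hashlib

# Hypothetical in-memory "origin server" state; stands in for a real HTTP API.
RESOURCE = {"body": b'{"name": "alice"}'}


def make_etag(body: bytes) -> str:
    """Derive an ETag from the body (one common choice; any opaque value works)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def origin_get(request_headers: dict) -> tuple:
    """Answer a GET, treating If-None-Match as a conditional request."""
    body = RESOURCE["body"]
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # The client's copy is current: send 304 with an empty body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


class ConditionalClient:
    """Client that remembers the last ETag and body it received."""

    def __init__(self):
        self.etag = None
        self.body = None
        self.requests_sent = 0
        self.bytes_received = 0

    def fetch(self) -> bytes:
        headers = {}
        if self.etag is not None:
            headers["If-None-Match"] = self.etag  # step 2: send the stored ETag
        status, response_headers, body = origin_get(headers)
        self.requests_sent += 1
        self.bytes_received += len(body)
        if status == 200:
            # Step 1 (or changed data): store the new ETag and body.
            self.etag = response_headers["ETag"]
            self.body = body
        # On 304 the stored body is still valid and is reused.
        return self.body
```

Calling `fetch()` twice on an unchanged resource sends two requests, but the body is transferred only once, which matches the point about traffic rather than request count.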

Why?

  • The main reason for using conditional GET requests is to reduce data traffic.

Isn't this only used for limiting resources?

No...

  • You can ask an API for multiple resources in one request.
    • (OK, that's also limiting resources by saving the other requests.)
  • You can prevent a method (e.g. PUT) from modifying an existing resource when the client believes that the resource does not exist (replace protection, requested by sending If-None-Match: *).
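The replace protection mentioned in the last bullet can be sketched like this. It is an in-memory illustration only; `STORE` and `origin_put` are hypothetical names, and the status codes chosen for the success cases (201 and 204) are one reasonable option rather than the only one.

```python
# Hypothetical in-memory resource store keyed by path.
STORE = {}


def origin_put(path: str, body: bytes, request_headers: dict) -> int:
    """Create or replace a resource, honouring `If-None-Match: *`."""
    if request_headers.get("If-None-Match") == "*" and path in STORE:
        # The client asked to write only if nothing exists yet, but something does.
        return 412  # Precondition Failed: the existing resource is left untouched
    created = path not in STORE
    STORE[path] = body
    return 201 if created else 204
```

A first `origin_put("/notes/1", b"draft", {"If-None-Match": "*"})` creates the resource; repeating the same call is rejected with 412 instead of silently overwriting it.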

I can obtain the resources with or without it?

When you ignore the "MUST use conditional GET" requirement, (a) the traffic will increase and (b) you lose the "resource has changed" indication coming from the server side. You would have to implement the comparison on the client side: is the resource from the second request newer than the one from the first request?

Jens A. Koch answered Sep 27 '22 20:09


I found my question wasn't asking the "right question" because I had merged my understanding of other headers into it (thanks to @dcerecedo's comment for pointing me in the right direction), and that was affecting my understanding of why MUST was being used.

The MUST was more relevant to other headers: in my case, the Cache-Control directives private, max-age=3600 and must-revalidate.

Where

  1. Cache-Control: private restricts proxy servers from caching the response. This helps you keep your data off a server you don't trust and prevents a proxy from caching user-specific data that's not relevant to everyone (like a user profile).

  2. Cache-Control: max-age=3600, must-revalidate tells both client caches and proxy caches that once the content is stale (older than 3600 seconds) they must revalidate with the origin server before they can serve the content. This should be the default behavior of caching systems, but the must-revalidate directive makes this requirement unambiguous.

After the max-age expires, the client must revalidate. It typically revalidates using the If-None-Match header with an ETag, or the If-Modified-Since header with a date. So, after expiration the browser will check with the server whether the file has been updated. If not, the server will respond with 304 Not Modified and nothing is downloaded.
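The interaction between max-age and revalidation can be sketched as below. This is a simplified model under stated assumptions, not a full HTTP cache: `parse_max_age`, `CachedEntry`, and `serve` are hypothetical names, time is passed in as plain seconds, and the `revalidate` callback stands in for a conditional GET to the origin that returns `(status, body, etag)`.

```python
import re


def parse_max_age(cache_control: str):
    """Extract max-age in seconds from a Cache-Control value, or None if absent."""
    match = re.search(r"max-age=(\d+)", cache_control)
    return int(match.group(1)) if match else None


class CachedEntry:
    """A stored response plus the metadata needed to judge freshness."""

    def __init__(self, body, etag, cache_control, stored_at):
        self.body = body
        self.etag = etag
        self.max_age = parse_max_age(cache_control)
        self.stored_at = stored_at

    def is_fresh(self, now) -> bool:
        # Fresh while the entry's age is below its freshness lifetime.
        return self.max_age is not None and (now - self.stored_at) < self.max_age


def serve(entry, now, revalidate):
    """Return (body, contacted_origin) for a request arriving at time `now`."""
    if entry.is_fresh(now):
        return entry.body, False  # served from cache, no request sent
    # Stale: must-revalidate forbids serving without asking the origin first.
    status, body, etag = revalidate(entry.etag)
    if status == 304:
        entry.stored_at = now  # same content, freshness lifetime restarts
    else:
        entry.body, entry.etag, entry.stored_at = body, etag, now
    return entry.body, True
```

With `max-age=3600`, a request at 1800 seconds is answered from the cache, while a request at 3600 seconds triggers a revalidation; a 304 reply keeps the stored body and makes the entry fresh again.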

garrettmac answered Sep 27 '22 20:09