
HTTP: Combining expiration and validation caching

I'm having trouble formulating HTTP cache headers for the following situation.

Our server has large data that changes perhaps a couple of times a week. I want browsers to cache this data. Additionally, I want to minimize latency from conditional GETs, as the network is unreliable.

The end behavior I'm after is this:

  1. Client requests a resource it hasn't seen before.
  2. Server responds with resource along with ETag and max-age (24 hours).
  3. Until 24 hours have passed, client will use cached resource.
  4. After the expiration date, client will perform a validation request (If-None-Match: [etag]).
  5. If resource has not changed:
    • server responds with 304 Not Modified
    • client is somehow informed that the existing resource has a new expiration date 24 hours from now
    • return to step 3

Boiled down to its essence: can a 304 response contain a new max-age? Or is the original max-age honored for subsequent requests?

roufamatic asked Jun 14 '11 21:06




1 Answer

Yes, a 304 response can contain a new max-age (or ETag, or other response headers for that matter).

I did an experiment using Firefox 4 to test whether the original max-age or the new one is honored, and the answer was that the new max-age is honored, so you should be able to implement what you want to do.
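As a rough illustration of the server side (not the code used in the experiment), here is a minimal sketch built on Python's standard http.server module. The names PAYLOAD, make_etag, CachedResourceHandler and serve are hypothetical, the resource is a fixed byte string, and If-None-Match is compared as a single exact value, which ignores lists of ETags, weak validators and "*". The point is only that the 200 and the 304 both carry the ETag and a max-age of 24 hours.

```python
import hashlib
from http.server import BaseHTTPRequestHandler, HTTPServer

MAX_AGE = 86400  # 24 hours, in seconds
PAYLOAD = b"large data that rarely changes"  # hypothetical resource body


def make_etag(body: bytes) -> str:
    # Validator derived from the content; any stable fingerprint would do.
    return '"' + hashlib.sha256(body).hexdigest() + '"'


class CachedResourceHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        etag = make_etag(PAYLOAD)
        # Simplification: exact match against a single ETag value only.
        if self.headers.get("If-None-Match") == etag:
            # Unchanged: send no body, but restate the ETag and a new max-age.
            self.send_response(304)
            self.send_header("ETag", etag)
            self.send_header("Cache-Control", f"max-age={MAX_AGE}")
            self.end_headers()
            return
        # First request, or the resource has changed: send the full body.
        self.send_response(200)
        self.send_header("ETag", etag)
        self.send_header("Cache-Control", f"max-age={MAX_AGE}")
        self.send_header("Content-Length", str(len(PAYLOAD)))
        self.end_headers()
        self.wfile.write(PAYLOAD)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def serve(port: int = 8000) -> None:
    # Blocks until interrupted; the port is an arbitrary choice.
    HTTPServer(("127.0.0.1", port), CachedResourceHandler).serve_forever()
```

Calling serve() and requesting any path twice, the second time with the returned ETag in If-None-Match, should produce a 200 followed by a 304. send_response adds the Date header itself, so the handler does not set it.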

It's important to remember that max-age is relative to the Date response header, not Last-Modified, so whenever your server sets a max-age directive of 24 hours, it is saying "24 hours from right now." So, assuming that's what you want, you won't have to change your max-age at all, just always return 86400.
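To make the "relative to Date" point concrete, here is a small sketch of the arithmetic. The function names fresh_until and is_fresh are made up for illustration, and the calculation is deliberately simplified: it ignores the Age header and any clock skew between client and server, both of which a real cache takes into account.

```python
from datetime import datetime, timedelta
from email.utils import parsedate_to_datetime


def fresh_until(date_header: str, max_age: int) -> datetime:
    """Instant at which a response with this Date and max-age stops being fresh."""
    return parsedate_to_datetime(date_header) + timedelta(seconds=max_age)


def is_fresh(date_header: str, max_age: int, now: datetime) -> bool:
    """True while the cached response can be reused without contacting the server."""
    return now < fresh_until(date_header, max_age)


# Illustrative value: a response generated at this Date with max-age=86400
# stays fresh for exactly 24 hours from that Date, whatever Last-Modified says.
expiry = fresh_until("Tue, 14 Jun 2011 23:48:51 GMT", 86400)
print(expiry.isoformat())
```

Because the Date header changes on every response, a constant max-age=86400 always means "24 hours from the moment this response was generated".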

Anyway, here's an overview and dump of my experiment. Basically, I hit a test URL that set an ETag and set max-age to 120 seconds. Accordingly, the server returned the page with these response headers:

HTTP/1.1 200 OK
Date: Tue, 14 Jun 2011 23:48:51 GMT
Cache-Control: max-age=120
Etag: "901ea3d0ac9303ae4855a09676f96701"
Last-Modified: Mon, 13 Jun 2011 22:20:03 GMT

I then repeatedly hit Enter in the address bar to load the page (without forcing a hard reload). There was no network traffic, since Firefox kept loading the page from cache. Then, once the 120 seconds were over, the very next time I hit Enter, Firefox instead sent a conditional GET to the server, as you would expect. The request and response from the server were:

GET /example HTTP/1.1
If-Modified-Since: Mon, 13 Jun 2011 22:20:03 GMT
If-None-Match: "901ea3d0ac9303ae4855a09676f96701"

HTTP/1.1 304 Not Modified
Date: Tue, 14 Jun 2011 23:50:54 GMT
Etag: "901ea3d0ac9303ae4855a09676f96701"
Cache-Control: max-age=240

Note that in the 304 response, I've had the server change max-age from 120 seconds to 240.

So, the big question is, what would happen after 120 seconds? Would Firefox respect the new max-age and continue loading the page from cache, or would it hit the server? The answer is that it continued loading the page from cache, and did not re-request until after 240 seconds were reached:

GET /example HTTP/1.1
If-Modified-Since: Mon, 13 Jun 2011 22:20:03 GMT
If-None-Match: "901ea3d0ac9303ae4855a09676f96701"

HTTP/1.1 304 Not Modified
Date: Tue, 14 Jun 2011 23:54:56 GMT
Etag: "901ea3d0ac9303ae4855a09676f96701"
Cache-Control: max-age=240

I repeated through another 240-second cycle and things worked as you'd expect. So, hopefully that answers the question for you.
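The behaviour observed above is consistent with the cache folding the headers of the 304 into the response it already has stored, then computing freshness from the updated Date and max-age. Below is a hedged sketch of that update step, not Firefox's actual implementation. The function name apply_304 is hypothetical and headers are modelled as plain dicts with case-insensitive names.

```python
def apply_304(stored: dict, not_modified: dict) -> dict:
    """Return the stored headers updated with those carried by a 304 response."""
    # Header names are case-insensitive, so normalize before merging.
    merged = {name.lower(): value for name, value in stored.items()}
    merged.update({name.lower(): value for name, value in not_modified.items()})
    return merged


# Headers from the original 200 response in the experiment.
stored = {
    "Date": "Tue, 14 Jun 2011 23:48:51 GMT",
    "Cache-Control": "max-age=120",
    "Etag": '"901ea3d0ac9303ae4855a09676f96701"',
    "Last-Modified": "Mon, 13 Jun 2011 22:20:03 GMT",
}
# Headers from the first 304 response.
revalidated = {
    "Date": "Tue, 14 Jun 2011 23:50:54 GMT",
    "Etag": '"901ea3d0ac9303ae4855a09676f96701"',
    "Cache-Control": "max-age=240",
}

updated = apply_304(stored, revalidated)
print(updated["cache-control"])  # max-age=240
print(updated["last-modified"])  # Mon, 13 Jun 2011 22:20:03 GMT
```

Headers the 304 omits, such as Last-Modified, are kept from the stored response, which is why the later conditional requests still carry If-Modified-Since.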

The HTTP/1.1 RFC (RFC 2616, whose caching rules have since been superseded by RFC 7234 and then RFC 9111) explains how age computations are supposed to be implemented, and how the other Cache-Control directives work. There's no guarantee that every browser and proxy will follow the rules, but at this point HTTP/1.1 is pretty old and you'd expect most of them to do as Firefox does.

(Note: For brevity in these example dumps, I've deleted irrelevant headers such as host, connection/keep-alive, content encoding/length/type, user-agent etc.)

joelhardi answered Sep 22 '22 12:09