Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ASP.NET MVC and IE caching - manipulating response headers ineffective

Tags:

Background

I'm attempting to help a colleague debug an issue that hasn't been an issue for the past 6 months. After the most recent deployment of an ASP.NET MVC 2 application, FileResult responses that force a PDF file at the user for opening or saving are having trouble existing long enough on the client machine for the PDF reader to open them.

Earlier versions of IE (expecially 6) are the only browsers affected. Firefox and Chrome and newer versions of IE (>8) all behave as expected. With that in mind, the next section defines the actions necessary to recreate the issue.

Behavior

  1. User clicks a link that points to an action method (a plain hyperlink with an href attribute).
  2. The action method generates a PDF represented as a byte stream. The method always recreates the PDF.
  3. In the action method, headers are set to instruct browsers how to cache the response. They are:

    response.AddHeader("Cache-Control", "public, must-revalidate, post-check=0, pre-check=0"); response.AddHeader("Pragma", "no-cache"); response.AddHeader("Expires", "0"); 

    For those unfamiliar with exactly what the headers do:

    a. Cache-Control: public

    Indicates that the response MAY be cached by any cache, even if it would normally be non-cacheable or cacheable only within a non- shared cache.

    b. Cache-Control: must-revalidate

    When the must-revalidate directive is present in a response received by a cache, that cache MUST NOT use the entry after it becomes stale to respond to a subsequent request without first revalidating it with the origin server

    c. Cache-Control: pre-check (introduced with IE5)

    Defines an interval in seconds after which an entity must be checked for freshness. The check may happen after the user is shown the resource but ensures that on the next roundtrip the cached copy will be up-to-date.

    d. Cache-Control: post-check (introduced with IE5)

    Defines an interval in seconds after which an entity must be checked for freshness prior to showing the user the resource.

    e. Pragma: no-cache (to ensure backwards compatibility with HTTP/1.0)

    When the no-cache directive is present in a request message, an application SHOULD forward the request toward the origin server even if it has a cached copy of what is being requested

    f. Expires

    The Expires entity-header field gives the date/time after which the response is considered stale.

  4. We return the file from the action

    return File(file, "mime/type", fileName); 
  5. The user is presented with an Open/Save dialog box

  6. Clicking "Save" works as expected, but clicking "Open" launches the PDF reader, but the temporary file IE stored has already been deleted by the time the reader tries to open the file, so it complains that the file is missing (and it is).

There are a half dozen other apps here that use the same headers to force Excel, CSV, PDF, Word, and a ton of other content at users and there's never been an issue.

The Question

  • Are the headers correct for what we're trying to do? We want the file to exist temporarily (get cached), but always be replaced by new versions even though the requests may be identical).

The response headers are set in the action method before return a FileResult. I've asked my colleague to try creating a new class that inherits from FileResult and to instead override the ExecuteResult method so that it modifies the headers and then does base.ExecuteResult() instead -- no status on that.

I have a hunch the "Expires" header of "0" is the culprit. According to this W3C article, setting it to "0" implies "already expired." I do want it to be expired, I just don't want IE to go removing it off of the filesystem before the application handling it gets a chance to open it.

As always, thanks!

Edit: The Solution

Upon further testing (using Fiddler to inspect the headers), we were seeing that the response headers we thought were getting set were not the ones being interpreted by the browser. Having not been familiar with the code myself, I was unaware of an underlying issue: the headers were getting stomped on outside of the action method.

Nonetheless, I'm going to leave this question open. Still outstanding is this: there seems to be some discrepancy between the Expires header having a value of 0 vs. -1. If anybody can lay claim to differences by design, in regards to IE, I would still like to hear about it. As for a solution though, the above headers do work as intended with the Expires value set to -1 in all browsers.

Update 1

The post How to control web page caching, across all browsers? describes in detail that caching can be prevented in all browsers with the help of setting Expires = 0. I'm still not sold on this 0 vs -1 argument...

like image 297
Cᴏʀʏ Avatar asked Feb 10 '12 19:02

Cᴏʀʏ


1 Answers

I think you should just use

HttpContext.Current.Response.Cache.SetMaxAge (new TimeSpan (0)); 

or

HttpContext.Current.Response.Headers.Set ("Cache-Control", "private, max-age=0"); 

to set max-age=0 which means nothing more as the cache re-validating (see here). If you would be set additionally ETag in the header with some your custom checksum of hash from the data, the ETag from the previous request will be sent to the server. The server are able either to return the data or, in case that the data are exactly the same as before, it can return empty body and HttpStatusCode.NotModified as the status code. In the case the web browser will get the data from the local browser cache.

I recommend you to use Cache-Control: private which force two important things: 1) switch off caching the data on the proxy, which has sometimes very aggressive caching settings 2) it will allows the caching of the the data, but not permit sharing of the cache with another users. It can solve privacy problems because the data which you return to one user could be not allowed to read by another users. By the way the code HttpContext.Current.Response.Cache.SetMaxAge (new TimeSpan (0)) set Cache-Control: private, max-age=0 in the HTTP header by default. If you do want to use Cache-Control: public you can use SetCacheability (HttpCacheability.Public); to overwrite the behavior or use Headers.Set instead of Cache.SetMaxAge.

If you have interest to study more caching options of HTTP protocol I would recommend you to read the caching tutorial.

UPDATED: I decide to write some more information to clear my position. Corresponds to the information from the Wikipedia even so old web browsers like Mosaic 2.7, Netscape 2.0 and Internet Explorer 3.0 supports March 1996, pre-standard of HTTP/1.1 described in RFC 2068. So I suppose (but not test it) that the old web browsers support max-age=0 HTTP header. In any way Netscape 2.06 and Internet Explorer 4.0 definitively supports HTTP 1.1.

So you should ask you first: which HTML standards you use? Do you still use HTML 2.0 instead of more late HTML 3.2 published in January 1997? I suppose you use at least HTML 4.0 published in December 1997. So if you build your application at least in HTML 4.0, your site can be oriented on the web clients which supports HTTP 1.1 and ignore (don't support) the web clients which don't support HTTP 1.1.

Now about other "Cache-Control" headers as "private, max-age=0". Including of the headers is in my opinion is pure paranoia. As I have some caching problem myself I tried also to include different other headers, but later after reading carefully the section 14.9 of RFC2616 I use only "Cache-Control: private, max-age=0".

The only "Cache-Control" header which can be additionally discussed is "must-revalidate" described on the section 14.9.4 which I referenced before. Here is the quote:

The must-revalidate directive is necessary to support reliable operation for certain protocol features. In all circumstances an HTTP/1.1 cache MUST obey the must-revalidate directive; in particular, if the cache cannot reach the origin server for any reason, it MUST generate a 504 (Gateway Timeout) response.

Servers SHOULD send the must-revalidate directive if and only if failure to revalidate a request on the entity could result in incorrect operation, such as a silently unexecuted financial transaction. Recipients MUST NOT take any automated action that violates this directive, and MUST NOT automatically provide an unvalidated copy of the entity if revalidation fails.

Although this is not recommended, user agents operating under severe connectivity constraints MAY violate this directive but, if so, MUST explicitly warn the user that an unvalidated response has been provided. The warning MUST be provided on each unvalidated access, and SHOULD require explicit user confirmation.

Sometime if I have problem with Internet connection I see the empty page with "Gateway Timeout" message. It come from the the usage of "must-revalidate" directive. I don't think that "Gateway Timeout" message really help the user.

So the persons, how prefer to start self-destructive procedure if he hears "Busy" signal on the call to his boss, should additionally use "must-revalidate" directive in the "Cache-Control" header. Other persons I recommend just use "Cache-Control: private, max-age=0" and nothing more.

like image 95
Oleg Avatar answered Sep 28 '22 05:09

Oleg