 

Confused about minimum, maximum and default TTL in CloudFront

I have my web app in S3 and am serving it through a CloudFront Web distribution. I read the official documentation, but I am confused by a lot of the terminology.

My questions:

  1. I want CloudFront to cache objects for a maximum of 1 year (365 days). What do I have to do? (Do I have to set a header on each object in S3?)

I came across the Cache-Control header and found that if CloudFront returns this header with a value, browsers capable of caching will cache the objects for the given duration.

  2. How do I set the Cache-Control header in CloudFront so that objects are cached in the user's browser?

  3. Are there any tools to check the S3 and CloudFront deployment, namely the headers returned, so that the cache headers are easy to debug?

Update, after @Udo's answer: this is a screenshot of my request and response headers.

[Screenshot: request and response headers]

Lakshman Diwaakar asked Apr 11 '17 10:04


1 Answer

CloudFront does not add the Cache-Control header. It passes it through to the browser if the origin server supplies it.

If you didn't set a Cache-Control header when you uploaded your objects to S3, you will need to either upload the objects again or go into the S3 console and add the header to them, with a value of max-age=31536000 if you want browsers to cache the objects for up to a year.
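For many objects, the console route gets tedious. A minimal sketch of the same change done programmatically, assuming the third-party boto3 library and configured AWS credentials: copy each object onto itself with replaced metadata. The helper names, bucket, and key are hypothetical. Because the REPLACE directive discards the existing metadata, the Content-Type should be passed again.

```python
def build_copy_args(bucket, key, content_type, max_age=31536000):
    """Arguments for an S3 in-place copy that rewrites the object's metadata."""
    return {
        "Bucket": bucket,
        "Key": key,
        # Source and destination are the same object.
        "CopySource": {"Bucket": bucket, "Key": key},
        # REPLACE discards the existing metadata, so Content-Type is
        # supplied again alongside the new Cache-Control value.
        "MetadataDirective": "REPLACE",
        "ContentType": content_type,
        "CacheControl": f"max-age={max_age}",
    }


def set_cache_control(bucket, key, content_type, max_age=31536000):
    """Apply the header to one existing object (needs boto3 and credentials)."""
    import boto3  # third-party; imported lazily so build_copy_args has no dependencies

    s3 = boto3.client("s3")
    return s3.copy_object(**build_copy_args(bucket, key, content_type, max_age))
```

For example, set_cache_control("my-bucket", "index.html", "text/html") would set max-age=31536000 on that one object. Any user-defined metadata on the object would also need to be passed again, which this sketch does not do.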

If you configure CloudFront to "use origin cache headers," then CloudFront will use the max-age value from Cache-Control to determine how long the object can be cached at CloudFront, unless the s-maxage value is also there, in which case, CloudFront will use that instead.

If you configure the min/max/default TTLs, CloudFront will use these values to determine how long objects can be cached:

  • minimum: objects may be cached for at least this long, even if Cache-Control: max-age (or s-maxage if present) has a lower value. In fact, setting Minimum TTL to a nonzero value causes CloudFront to disregard the Cache-Control directives no-cache, no-store, and private, and cache the objects for the value of Minimum TTL -- useful in cases where you want browsers to see these directives but you still want CloudFront to cache the objects.
  • maximum: objects will not be cached any longer than this, even if Cache-Control: max-age has a higher value.
  • default: objects may be cached for this long, if Cache-Control is not seen on the object. You should not need this, because you should have Cache-Control headers everywhere.
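The interaction of the three settings with Cache-Control can be sketched as a small function. This is a simplified model of the rules described above, not CloudFront's actual implementation: it ignores the Expires header, and the function names and default values are assumptions chosen for illustration.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument-or-None} dict."""
    directives = {}
    for part in (value or "").split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') or None
    return directives


def cloudfront_ttl(cache_control, min_ttl=0, default_ttl=86400, max_ttl=31536000):
    """Approximate how many seconds CloudFront may keep an object.

    All TTLs are in seconds. The defaults are illustrative assumptions.
    """
    directives = parse_cache_control(cache_control)

    # no-cache / no-store / private: with a nonzero minimum TTL the object
    # is cached for the minimum TTL anyway; with min_ttl=0 this returns 0.
    if directives.keys() & {"no-cache", "no-store", "private"}:
        return min_ttl

    # s-maxage wins over max-age; the result is clamped to [min_ttl, max_ttl].
    for name in ("s-maxage", "max-age"):
        arg = directives.get(name)
        if arg is not None and arg.isdigit():
            return max(min_ttl, min(int(arg), max_ttl))

    # No usable Cache-Control: fall back to the default TTL.
    return max(min_ttl, default_ttl)
```

Under this model, cloudfront_ttl("max-age=60", min_ttl=300) yields 300 because the minimum overrides the lower max-age, and cloudfront_ttl("max-age=600, s-maxage=120") yields 120 because s-maxage takes precedence. The browser still sees the original header in both cases.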

Important things to note about these settings:

  • They only impact the CloudFront cache, not the browser cache.
  • Once an object is cached by CloudFront, it has no way of knowing that the object has been changed in S3. It may not check again until the timer expires.
  • There is little point in trying to "force" CloudFront to retain an object in cache for a long time by setting excessively long times (such as a year) in CloudFront, because CloudFront can purge any object from its cache at any time for any reason -- caches are volatile, by nature. The popularity of an object (or lack of popularity) may trigger CloudFront to purge it before the timer expires. On the next request, it will be fetched from the origin by CloudFront.

Also important: CloudFront has two geographically organized layers of caches -- regional (inner) and edge (outer). The edge caches are more numerous and more widely distributed, but the regional caches have larger storage capacity. If you fetch an object through CloudFront, CloudFront will cache that object somewhere (at one regional cache, at one edge cache, or at one of each), but the next request -- perhaps from a browser in a different geographic area -- may pass through an edge and a region through which the object has never been requested before. Alternatively, it might be requested through an edge that doesn't have it but whose regional cache does, in which case it is fetched from the regional cache.

Keep this in mind when reasoning about caching: no object can correctly be said to be either "in the cache" or "not in the cache" at a given time, because there is no "the" cache. There are multiple caches around the world, many of which do not communicate with each other because that would make things slower, not faster. If your web site is popular in Australia but not in England, there may be copies of your objects cached in Asia Pacific cache locations but not in Western Europe cache locations.

This behavior is all automatic and is not something you configure, but you need to be aware that CloudFront doesn't have a single, monolithic cache. Objects are cached in the places where they are being accessed.
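To make the "no single cache" point concrete, here is a toy model of the two layers. It is purely illustrative: the class, the edge and region names, and the fill policy (every miss populates both layers, nothing is ever evicted) are assumptions and do not describe CloudFront's real topology or behavior.

```python
class TwoTierCache:
    """Toy model: each edge has its own cache and one parent regional cache."""

    def __init__(self, edge_to_region):
        self.edge_to_region = dict(edge_to_region)
        self.edges = {edge: set() for edge in self.edge_to_region}
        self.regions = {region: set() for region in self.edge_to_region.values()}

    def fetch(self, edge, key):
        """Return where the object was served from, filling caches on the way."""
        region = self.edge_to_region[edge]
        if key in self.edges[edge]:
            return "edge hit"
        self.edges[edge].add(key)
        if key in self.regions[region]:
            return "regional hit"
        self.regions[region].add(key)
        return "origin fetch"


# Hypothetical edge locations mapped to hypothetical regional caches.
cdn = TwoTierCache({"sydney": "apac", "melbourne": "apac", "london": "europe"})
print(cdn.fetch("sydney", "/app.js"))     # origin fetch
print(cdn.fetch("sydney", "/app.js"))     # edge hit
print(cdn.fetch("melbourne", "/app.js"))  # regional hit
print(cdn.fetch("london", "/app.js"))     # origin fetch
```

The same object is a hit, a regional hit, or a miss depending only on which edge the request arrives at.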

Is there any tools to check the S3 & cloudFront deployment, namely the headers returned

Your eyeballs are the best tool. The response headers in the browser tell you what you need to know:

Age: how long (in seconds) the object has been in CloudFront's cache.

X-Cache: Hit from cloudfront means CloudFront did not have to fetch the object from S3, because it was already cached. Miss from cloudfront means CloudFront did not have the object in its cache at the edge handling this request, and needed to fetch it from S3.

The command-line utility curl, with its -v option, is also useful for observing response headers.
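If you would rather script the check, the headers can be read with the Python standard library. This is a rough sketch: the function names are made up for illustration, and only a plain "Hit from cloudfront" is treated as a cache hit.

```python
import urllib.request


def summarize_cache_headers(headers):
    """Interpret the cache-related response headers described above."""
    lowered = {name.lower(): value for name, value in headers.items()}
    x_cache = lowered.get("x-cache", "")
    age = lowered.get("age")
    return {
        # Only a value beginning with "Hit" counts as served from cache here.
        "served_from_cloudfront_cache": x_cache.lower().startswith("hit"),
        "seconds_in_cloudfront_cache": int(age) if age and age.isdigit() else None,
        "cache_control": lowered.get("cache-control"),
    }


def fetch_headers(url):
    """Issue a HEAD request and return the response headers as a dict."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return dict(response.headers.items())
```

Calling summarize_cache_headers(fetch_headers(url)) with the URL of an object in your distribution, twice in a row, would typically show a miss followed by a hit with a small Age, provided both requests reach the same edge.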

Michael - sqlbot answered Nov 02 '22 23:11