I can't seem to see an obvious answer in the documentation.
When I update a file on S3 and I have CloudFront enabled, does S3 send an invalidation signal to CloudFront? Or do I need to send it myself after updating the file?
S3 doesn't send any invalidation information to CloudFront. By default, CloudFront holds an object for up to the maximum time specified by the Cache-Control headers that were set when it retrieved the object from the origin (it may evict items from its cache earlier if it chooses to).
You can invalidate cache entries by creating an invalidation batch. This can cost you money: the first 1,000 invalidation paths a month are free, but beyond that it costs $0.005 per path. If you were invalidating 1,000 files a day, it would cost you roughly $145 a month (about 30,000 paths, less the 1,000 free ones, at $0.005 each), unless you can make use of the wildcard feature. You can of course trigger this in response to an S3 event using an AWS Lambda function.
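As a rough illustration of the Lambda approach, here is a minimal Python sketch that turns an S3 event notification into a CloudFront invalidation batch. The distribution ID and the helper names (`paths_from_s3_event`, `build_invalidation_batch`) are made up for this example; the only AWS call used is boto3's `create_invalidation`, and the function would still need the right IAM permissions and an S3 trigger configured.

```python
import time
import urllib.parse

# Hypothetical distribution ID (an assumption); replace with your own.
DISTRIBUTION_ID = "EXAMPLE123"


def paths_from_s3_event(event):
    """Collect CloudFront paths ("/" + object key) from an S3 event notification."""
    paths = []
    for record in event.get("Records", []):
        key = record["s3"]["object"]["key"]
        # S3 URL-encodes object keys in event notifications (spaces arrive as "+").
        key = urllib.parse.unquote_plus(key)
        # Invalidation paths should have unsafe characters URL-encoded.
        paths.append("/" + urllib.parse.quote(key))
    # Drop duplicates while keeping order, since every path is billed.
    return list(dict.fromkeys(paths))


def build_invalidation_batch(paths, caller_reference=None):
    """Build the InvalidationBatch structure that create_invalidation expects."""
    return {
        "Paths": {"Quantity": len(paths), "Items": paths},
        # CallerReference must be unique per request; a timestamp is one simple choice.
        "CallerReference": caller_reference or str(time.time_ns()),
    }


def handler(event, context):
    """Lambda entry point: invalidate every object named in the S3 event."""
    import boto3  # imported here so the helpers above can be used without boto3

    paths = paths_from_s3_event(event)
    if not paths:
        return None
    client = boto3.client("cloudfront")
    return client.create_invalidation(
        DistributionId=DISTRIBUTION_ID,
        InvalidationBatch=build_invalidation_batch(paths),
    )
```

If many objects change at once, a single wildcard path such as `/img/*` counts as one path and may be cheaper than listing each file.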
Another approach is to use a different path when the object changes (in effect a generational cache key). Similarly, you could append a query parameter to the URL and change that query parameter when you want CloudFront to fetch a fresh copy (to do this you'll need to configure CloudFront to use query string parameters; by default it ignores them).
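Both variants can be sketched in a few lines of Python. The function names and the `v` query parameter are illustrative choices, not anything CloudFront requires; the version string could be a release number or a short content hash.

```python
import hashlib
import posixpath


def content_version(data: bytes, length: int = 8) -> str:
    """Derive a short version string from the file contents."""
    return hashlib.sha256(data).hexdigest()[:length]


def versioned_path(path: str, version: str) -> str:
    """Put the version in the path: css/site.css -> css/site-<version>.css"""
    root, ext = posixpath.splitext(path)
    return f"{root}-{version}{ext}"


def versioned_query(url: str, version: str) -> str:
    """Append the version as a query parameter (hypothetical name: v)."""
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}v={version}"
```

Because the URL changes whenever the content does, the old cached copy is simply never requested again and no invalidation is needed.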
Another option, if you only make infrequent (but large) changes, is simply to create a new CloudFront distribution.
As far as I know, all CDNs work like this.
It's why you generally use something like foo-x.y.z.ext to version assets on a CDN. I wouldn't use foo.ext?x.y.z because there was something about certain browsers and proxies never caching assets with a ?QUERY_STRING.
In general you may want to check this out: https://developers.google.com/speed/docs/best-practices/caching
It contains lots of best practices and goes into detail on what to do and how it works.
In regard to S3 and CloudFront, I'm not super familiar with the cache invalidation, but what Frederick Cheung mentioned is all correct.
Some providers also allow you to clear the cache directly but because of the nature of a CDN these changes are almost never instant. Another method is to set a smaller TTL (expiration headers) so assets will be refreshed more often. But I think that defeats the purpose of a CDN as well.
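One possible middle ground, sketched below in Python, is to give only unversioned files a short TTL and let versioned assets be cached for a long time. The filename pattern and the TTL values are assumptions chosen for illustration, not recommendations from any provider.

```python
import re

# Hypothetical naming convention (an assumption): "-<version>" or "-<hex digest>"
# right before the extension, e.g. app-1.2.3.js or site-ab12cd34.css.
VERSIONED = re.compile(r"-(\d+(\.\d+)+|[0-9a-f]{8,})\.[A-Za-z0-9]+$")


def cache_control_for(key, short_ttl=300, long_ttl=31536000):
    """Pick a Cache-Control header value for an object key."""
    if VERSIONED.search(key):
        # The name changes with the content, so a long TTL is safe.
        return f"public, max-age={long_ttl}, immutable"
    # Unversioned files (e.g. index.html) get refreshed more often.
    return f"public, max-age={short_ttl}"
```

With boto3, the result could be passed as the `CacheControl` argument when uploading the object to S3, so the CDN picks it up from the origin.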
In our case (Edgecast), cache invalidation is possible (a manual process) and free of charge, but we rarely do this because we version our assets accordingly.