Always serve content from a CDN edge cache, regardless of how stale, and refresh it in the background when possible.
I have a Next.js app that renders some React components server-side and delivers them to the client. For this discussion, let's just consider my homepage, which is unauthenticated and the same for everyone.
What I'd like is for the server-rendered homepage to be cached at a CDN's edge nodes and served to end clients from that cache as often as possible, or always.
From what I've read, CDNs (like Fastly) that properly support cache-related headers such as Surrogate-Control and Cache-Control: stale-while-revalidate should be able to do this, but in practice I'm not seeing it work the way I'd expect.
Consider the following timeline:
[T0] - Visitor1 requests www.mysite.com
- The CDN cache is completely cold, so the request must go back to my origin (AWS Lambda) and recompute the homepage. A response is returned with the headers Surrogate-Control: max-age=100 and Cache-Control: public, no-store, must-revalidate.
Visitor1 is then served the homepage, but they had to wait a whopping 5 seconds! YUCK! May no other visitor ever have to suffer the same fate.
[T50] - Visitor2 requests www.mysite.com
- The CDN cache contains my document and returns it to the visitor immediately. They only had to wait 40ms! Awesome. In the background, the CDN refetches the latest version of the homepage from my origin. Turns out it hasn't changed.
[T80] - www.mysite.com publishes new content to the homepage, making any cached content truly stale. V2 of the site is now live!
[T110] - Visitor1 returns to www.mysite.com
- From the CDN's perspective, it's only been 60s since Visitor2's request, which means the background refresh initiated by Visitor2 should have resulted in a <100s-stale copy of the homepage in the cache (albeit V1, not V2, of the homepage). Visitor1 is served the 60s-stale V1 homepage from cache. A much better experience for Visitor1 this time!
This request initiates a background refresh of the stale content in the CDN cache, and the origin this time returns V2 of the website (which was published 30s ago).
[T160] - Visitor3 visits www.mysite.com
- Despite being a new visitor, the CDN cache is now fresh from Visitor1's most recent trigger of a background refresh. Visitor3 is served a cached V2 homepage.
...
As long as at least 1 visitor comes to my site every 100s (because max-age=100), no visitor will ever suffer the wait time of a full roundtrip to my origin.
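For reference, here is a minimal sketch of how a Next.js page might emit the headers from the timeline above. This is an assumption about wiring, not a confirmed fix: the TTL values mirror the example, Surrogate-Control handling is Fastly-specific, and the function would normally be export-ed from a page module.

```javascript
// Hypothetical Next.js data-fetching function (normally `export`ed from
// pages/index.js). The values mirror the timeline above.
async function getServerSideProps({ res }) {
  // Edge cache: fresh for 100s, then serve stale while revalidating
  // in the background for up to a day. Fastly consumes (and strips)
  // Surrogate-Control at the edge.
  res.setHeader(
    'Surrogate-Control',
    'max-age=100, stale-while-revalidate=86400'
  );
  // Keep browsers from caching, so only the edge holds a copy.
  res.setHeader('Cache-Control', 'no-store');
  return { props: {} };
}
```

The split between Surrogate-Control (for the CDN) and Cache-Control (for the browser) is what lets the edge serve stale copies without the client ever caching one itself.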
1. Is this a reasonable ask of a modern CDN? I can't imagine this is more taxing than always returning to the origin (no CDN cache), but I've struggled to find documentation from any CDN provider about the right way to do this. I'm working with Fastly now, but am willing to try others as well (I tried Cloudflare first, but read that they don't support stale-while-revalidate).
2. What are the right headers to do this with? (assuming the CDN provider supports them)
I've played around with both Surrogate-Control: max-age=<X> and Cache-Control: public, s-maxage=<X>, stale-while-revalidate in Fastly and Cloudflare, but neither seems to do this correctly (requests well within the max-age timeframe don't pick up changes on the origin until there is a cache miss).
3. If this isn't supported, are there API calls that could allow me to PUSH content updates to my CDN's cache layer, effectively saying "Hey, I just published new content for this cache key. Here it is!"?
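(For what it's worth, Fastly does expose purge-on-publish: you can send an HTTP PURGE request for a cached URL, or purge by surrogate key via their API, right after publishing. The sketch below only builds the request description so it can be inspected without hitting the network; the URL and token handling are placeholders, not a confirmed integration.)

```javascript
// Rough sketch: invalidate a single URL in Fastly's cache after publishing.
// Fastly accepts an HTTP PURGE request to the cached URL; authenticated
// purging adds a Fastly-Key header. Token handling here is an assumption.
function buildPurgeRequest(url, apiToken) {
  return {
    url,
    method: 'PURGE',
    headers: apiToken ? { 'Fastly-Key': apiToken } : {},
  };
}

// After publishing V2 of the homepage, something like:
//   const req = buildPurgeRequest('https://www.mysite.com/', process.env.FASTLY_TOKEN);
//   await fetch(req.url, { method: req.method, headers: req.headers });
```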
I could use a Cloudflare Worker to implement this kind of caching myself using their KV store, but I thought I'd do a little more research before implementing a code solution to a problem that seems to be pretty common.
Thanks in advance!
I've been deploying a similar application recently. I ended up running a customised nginx instance in front of the Next.js server.
The config below ignores the Cache-Control header from Next.js when deciding what to cache (everything is cached for a fixed 10 minutes) and doesn't emit tailored Cache-Control headers to the client. You could tweak this config to use the values in Cache-Control from Next.js, and then drop that header before responding to the client if the MIME type is text/html or application/json. This isn't perfect, but it handles the important stale-while-revalidate behaviour. You could run a CDN over this as well if you want the benefit of global propagation.
Warning: This hasn't been extensively tested. I'm not confident that all the behaviour around error pages and response codes is right.
# Available in NGINX Plus
# map $request_method $request_method_is_purge {
# PURGE 1;
# default 0;
# }
proxy_cache_path /nginx/cache inactive=30d max_size=800m keys_zone=cache_zone:10m;
server {
listen 80 default_server;
listen [::]:80 default_server;
# Basic
root /nginx;
index index.html;
try_files $uri $uri/ =404;
access_log off;
log_not_found off;
# Redirect server error pages to the static page /error.html
error_page 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 500 501 502 503 504 505 /error.html;
# Catch error page route to prevent it being proxied.
location /error.html {}
location / {
# Let the backend server know the frontend hostname, client IP, and
# client–edge protocol.
proxy_set_header X-Forwarded-For $remote_addr;
proxy_set_header X-Forwarded-Proto $scheme;
# This header is a standardised replacement for the above two. This line
# naively ignores any `Forwarded` header passed from the client (which could
# be another proxy), and instead creates a new value equivalent to the two
# above.
proxy_set_header Forwarded "for=$remote_addr;proto=$scheme";
# Use HTTP 1.1, as 1.0 is default
proxy_http_version 1.1;
# Available in NGINX Plus
# proxy_cache_purge $request_method_is_purge;
# Enable stale-while-revalidate and stale-if-error caching
proxy_cache_background_update on;
proxy_cache cache_zone;
proxy_cache_lock on;
proxy_cache_lock_age 30s;
proxy_cache_lock_timeout 30s;
proxy_cache_use_stale error timeout invalid_header updating http_500 http_502 http_503 http_504;
proxy_ignore_headers X-Accel-Expires Expires Cache-Control Vary;
proxy_cache_valid 10m;
# Prevent 502 error
proxy_buffers 8 32k;
proxy_buffer_size 64k;
proxy_read_timeout 3600;
proxy_pass "https://example.com";
}
}