Always serve content from a CDN edge cache, regardless of how stale, and refresh it in the background when possible.
I have a Next.js app that renders some React components server-side and delivers them to the client. For this discussion, let's just consider my homepage, which is unauthenticated and the same for everyone.
What I'd like is for the server-rendered homepage to be cached at a CDN's edge nodes and served to end clients from that cache as often as possible, or always.
From what I've read, CDNs (like Fastly) that properly support cache-related headers such as Surrogate-Control and Cache-Control: stale-while-revalidate should be able to do this, but in practice I'm not seeing it work the way I'd expect.
Consider the following timeline:
[T0] - Visitor1 requests www.mysite.com
- The CDN cache is completely cold, so the request must go back to my origin (AWS Lambda) and recompute the homepage. A response is returned with the headers Surrogate-Control: max-age=100 and Cache-Control: public, no-store, must-revalidate.
Visitor1 is then served the homepage, but they had to wait a whopping 5 seconds! YUCK! May no other visitor ever have to suffer the same fate.
[T50] - Visitor2 requests www.mysite.com
- The CDN cache contains my document and returns it to the visitor immediately. They only had to wait 40ms! Awesome. In the background, the CDN refetches the latest version of the homepage from my origin. Turns out it hasn't changed.
[T80] - www.mysite.com publishes new content to the homepage, making any cached content truly stale. V2 of the site is now live!
[T110] - Visitor1 returns to www.mysite.com
- From the CDN's perspective, it's only been 60s since Visitor2's request, which means the background refresh initiated by Visitor2 should have resulted in a <100s-stale copy of the homepage in the cache (albeit V1, not V2, of the homepage). Visitor1 is served the 60s-stale V1 homepage from cache. A much better experience for Visitor1 this time!
This request initiates a background refresh of the stale content in the CDN cache, and the origin this time returns V2 of the website (which was published 30s ago).
[T160] - Visitor3 visits www.mysite.com
- Despite being a new visitor, the CDN cache is now fresh from Visitor1's most recent trigger of a background refresh. Visitor3 is served a cached V2 homepage.
...
As long as at least 1 visitor comes to my site every 100s (because max-age=100), no visitor will ever suffer the wait time of a full roundtrip to my origin.
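For reference, here is a minimal sketch of how a Next.js page might emit the headers from the timeline above. This is an assumption about wiring, not a confirmed fix: the TTL values mirror the example, Surrogate-Control handling is Fastly-specific, and the function would normally be export-ed from a page module.

```javascript
// Hypothetical Next.js data-fetching function (normally `export`ed from
// pages/index.js). The values mirror the timeline above.
async function getServerSideProps({ res }) {
  // Edge cache: fresh for 100s, then serve stale while revalidating
  // in the background for up to a day. Fastly consumes (and strips)
  // Surrogate-Control at the edge.
  res.setHeader(
    'Surrogate-Control',
    'max-age=100, stale-while-revalidate=86400'
  );
  // Keep browsers from caching, so only the edge holds a copy.
  res.setHeader('Cache-Control', 'no-store');
  return { props: {} };
}
```

The split between Surrogate-Control (for the CDN) and Cache-Control (for the browser) is what lets the edge serve stale copies without the client ever caching one itself.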
1. Is this a reasonable ask of a modern CDN? I can't imagine this is more taxing than always returning to the origin (no CDN cache), but I've struggled to find documentation from any CDN provider about the right way to do this. I'm working with Fastly now, but am willing to try others as well (I tried Cloudflare first, but read that they don't support stale-while-revalidate).
2. What are the right headers to do this with? (assuming the CDN provider supports them)
I've played around with both Surrogate-Control: max-age=<X> and Cache-Control: public, s-maxage=<X>, stale-while-revalidate in Fastly and Cloudflare, but neither seems to do this correctly (requests well within the max-age timeframe don't pick up changes on the origin until there is a cache miss).
3. If this isn't supported, are there API calls that could allow me to PUSH content updates to my CDN's cache layer, effectively saying "Hey, I just published new content for this cache key. Here it is!"?
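(For what it's worth, Fastly does expose purge-on-publish: you can send an HTTP PURGE request for a cached URL, or purge by surrogate key via their API, right after publishing. The sketch below only builds the request description so it can be inspected without hitting the network; the URL and token handling are placeholders, not a confirmed integration.)

```javascript
// Rough sketch: invalidate a single URL in Fastly's cache after publishing.
// Fastly accepts an HTTP PURGE request to the cached URL; authenticated
// purging adds a Fastly-Key header. Token handling here is an assumption.
function buildPurgeRequest(url, apiToken) {
  return {
    url,
    method: 'PURGE',
    headers: apiToken ? { 'Fastly-Key': apiToken } : {},
  };
}

// After publishing V2 of the homepage, something like:
//   const req = buildPurgeRequest('https://www.mysite.com/', process.env.FASTLY_TOKEN);
//   await fetch(req.url, { method: req.method, headers: req.headers });
```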
I could use a Cloudflare Worker to implement this kind of caching myself using their KV store, but I thought I'd do a little more research before implementing a code solution to a problem that seems to be pretty common.
Thanks in advance!
I've been deploying a similar application recently. I ended up running a customised nginx instance in front of the Next.js server.
The config below ignores the Cache-Control header from Next.js when deciding what to cache (everything is cached for a fixed 10 minutes) and doesn't emit tailored Cache-Control headers to the client. You could tweak this config to use the values in Cache-Control from Next.js, and then drop that header before responding to the client if the MIME type is text/html or application/json. This isn't perfect, but it handles the important stale-while-revalidate behaviour. You could run a CDN over this as well if you want the benefit of global propagation.
Warning: This hasn't been extensively tested. I'm not confident that all the behaviour around error pages and response codes is right.
# Available in NGINX Plus
# map $request_method $request_method_is_purge {
# PURGE 1;
# default 0;
# }
proxy_cache_path /nginx/cache inactive=30d max_size=800m keys_zone=cache_zone:10m;
server {
listen 80 default_server;
listen [::]:80 default_server;
# Basic
root /nginx;
index index.html;
try_files $uri $uri/ =404;
access_log off;
log_not_found off;
# Redirect server error pages to the static page /error.html
error_page 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 500 501 502 503 504 505 /error.html;
# Catch error page route to prevent it being proxied.
location /error.html {}
location / {
# Let the backend server know the frontend hostname, client IP, and
# client–edge protocol.
proxy_set_header X-Forwarded-For $remote_addr;
proxy_set_header X-Forwarded-Proto $scheme;
# This header is a standardised replacement for the above two. This line
# naively ignores any `Forwarded` header passed from the client (which could
# be another proxy), and instead creates a new value equivalent to the two
# above.
proxy_set_header Forwarded "for=$remote_addr;proto=$scheme";
# Use HTTP 1.1, as 1.0 is default
proxy_http_version 1.1;
# Available in NGINX Plus
# proxy_cache_purge $request_method_is_purge;
# Enable stale-while-revalidate and stale-if-error caching
proxy_cache_background_update on;
proxy_cache cache_zone;
proxy_cache_lock on;
proxy_cache_lock_age 30s;
proxy_cache_lock_timeout 30s;
proxy_cache_use_stale error timeout invalid_header updating http_500 http_502 http_503 http_504;
proxy_ignore_headers X-Accel-Expires Expires Cache-Control Vary;
proxy_cache_valid 10m;
# Prevent 502 error
proxy_buffers 8 32k;
proxy_buffer_size 64k;
proxy_read_timeout 3600;
proxy_pass "https://example.com";
}
}