Where to store and cache JSON?

Tags:

json

heroku

I have been considering caching my JSON on Amazon CloudFront.

The issue with that is it can take 15 minutes to manually clear that cache when the JSON is updated.

Is there a way to store a simple JSON value in a CDN-like HTTP cache that:

  • does not touch an application server (Heroku) after initial generation
  • allows me to instantly expire a cache

Update

In response to AdamKG's point:

If it's being "updated", it's not static :D Write a new version and tell your servers to use the new URL.

My actual idea is to cache a new CloudFront URL every time an HTML page changes. That was my original focus.

The reason I want the JSON is to store the version number for that latest CloudFront URL. That way I can make one AJAX call to discover which version to load, then a second AJAX call to actually load the content. This way I never need to expire CloudFront content; I just redirect the AJAX call that loads it.

But then I have the issue of the JSON needing to be cached. I don't want people hitting the Heroku dynos every time they want to see the single JSON version number. I know Memcached and Rack can help me speed that up, but it's a problem I just don't want to have.

Some ideas I've had:

  • Maybe there is a third-party service, similar to a Memcached database, that allows me to expose a value at a JSON URL? That way my dynos are never touched.
  • Maybe there is an alternative to CloudFront that allows for quicker manual expiration? I know that kind of defeats the nature of caching, but maybe there are more intermediary services, like a Varnish layer or something.
user537339 asked Oct 03 '22 02:10


1 Answer

One method is to use asset expiration similar to the way that Rails static assets are expired. Rails adds a hash signature to filenames, so something like application.js becomes application-abcdef1234567890.js. Whenever application.js changes, its hash changes too, so the next time a user requests your page, the script tag points to the new address.
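As a rough illustration of the idea (not Rails' actual implementation), a fingerprinted filename can be derived from the file's contents. The helper name `fingerprintName` and the choice of SHA-256 are assumptions made for this sketch, which runs under Node.js.

```javascript
const crypto = require('crypto');

// Hypothetical helper, not Rails' actual implementation: build a
// fingerprinted filename from the file's contents.
function fingerprintName(filename, contents) {
  // Hash the contents; any change to the file yields a different digest.
  const digest = crypto.createHash('sha256').update(contents).digest('hex');
  const dot = filename.lastIndexOf('.');
  // No extension (or a dotfile): append the digest at the end.
  if (dot <= 0) return filename + '-' + digest;
  // Otherwise insert the digest between the base name and the extension.
  return filename.slice(0, dot) + '-' + digest + filename.slice(dot);
}
```

Because the name depends only on the contents, an unchanged file keeps its URL and stays cached, while any edit produces a URL the CDN has never seen.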

Here is how I envision you doing this:

User → CloudFront (CDN) → Your App (Origin)

The user requests http://www.example.com/. The page has a meta tag

<meta content="1231231230" name="data-timestamp" />

based on the last time you updated the JSON resource. This could be generated from something like <%= Widget.order(updated_at: :desc).pluck(:updated_at).first.to_i %> if you are using Rails.

Then, in your application's JavaScript, grab the timestamp and use it to build your JSON URL.

var timestamp = $('meta[name=data-timestamp]').attr('content');
$.get('http://cdn.example.com/data-' + timestamp + '.json', function(data, textStatus, jqXHR) {
  blah(data);
});

The first request to CloudFront will hit your origin server at /data-1231231230.json, which can be generated once and cached forever. Each time the JSON is updated, the page carries a new timestamp, so the user gets a new URL to query the CDN.
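The URL scheme can be sketched as a pair of small functions, one for the client and one for the origin. The names `versionedDataUrl` and `parseDataVersion` are hypothetical, and the `data-<timestamp>.json` pattern is simply taken from the jQuery snippet above; how your origin actually routes the request is up to your framework.

```javascript
// Hypothetical helpers; the "data-<timestamp>.json" naming follows the
// jQuery snippet above, everything else is an assumption for illustration.

// Client side: build the CDN URL for a given timestamp.
function versionedDataUrl(cdnBase, timestamp) {
  const ts = String(timestamp);
  if (!/^\d+$/.test(ts)) {
    throw new Error('timestamp must be a non-negative integer: ' + ts);
  }
  // Strip trailing slashes so the base and path join with a single "/".
  return cdnBase.replace(/\/+$/, '') + '/data-' + ts + '.json';
}

// Origin side: recover the timestamp from a requested path, or null
// if the path is not a versioned data URL.
function parseDataVersion(path) {
  const match = /^\/data-(\d+)\.json$/.exec(path);
  return match ? Number(match[1]) : null;
}
```

Validating the timestamp on both sides keeps arbitrary strings out of the URL, so the CDN only ever caches paths of the expected shape.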

Update

Since you mention that the actual page is what you want to cache heavily, you are left with a couple of options. If you really want CloudFront in front of your server, your only real option is to send an invalidation request every time your homepage updates. You can invalidate 1,000 times per month for free, and it costs $5 per 1,000 after that. In addition, CloudFront invalidations are not fast (you mention up to 15 minutes), so there will still be a delay before the page is updated.

The other option is to cache your content in Memcached and serve it from your dynos. I will assume that you are using Ruby on Rails or another Ruby framework based on your question history (but please clarify if you are not). This entails installing Rack::Cache. The instructions on Heroku are for caching assets, but the same setup will work for dynamic content as well. Next, you would use Rack::Cache's invalidate method each time the page is updated. Yes, your dynos will handle some of the load, but each request will be a simple Memcached lookup and response.
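Stripped of Rack::Cache and Memcached specifics, the lookup-then-invalidate flow described here looks roughly like the following. This is a plain in-memory illustration in JavaScript, not Rack::Cache's API; `PageCache`, `fetch`, and `invalidate` are hypothetical names chosen for the sketch.

```javascript
// Hypothetical in-memory stand-in for the cache layer; not Rack::Cache's API.
class PageCache {
  constructor() {
    this.store = new Map();
  }

  // Return the cached body for `path`, rendering and storing it on a miss.
  fetch(path, render) {
    if (!this.store.has(path)) {
      this.store.set(path, render());
    }
    return this.store.get(path);
  }

  // Drop the cached body so the next request re-renders the page.
  invalidate(path) {
    return this.store.delete(path);
  }
}
```

The application only does real work on the first request after an invalidation; every other request is a single lookup, which is the trade-off the answer describes.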

Your server layout would look like:

User → CloudFront (CDN) → Rack::Cache → Your App (Origin) on cdn.example.com
User → Rack::Cache → Your App (Origin) on www.example.com

When you serve static assets like your images, CSS, and JavaScript, use the cdn.example.com domain. This will route requests through CloudFront and they will be cached for long periods of time. Requests to your app will go directly to your Heroku dyno, and the cacheable parts will be stored and retrieved by Rack::Cache.

Benjamin Manns answered Oct 08 '22 02:10