Rails Low-Level Caching: Update cache when ActiveRecord object updated_at changes OR when a new object is added to collection

Rails ships with Fragment Caching and Low-Level Caching. It is pretty clear how fragment caching works:

Rails will write a new cache entry with a unique key. If the value of updated_at has changed, a new key will be generated. Then Rails will write a new cache entry to that key, and the old cache written to the old key will never be used again. This is called key-based expiration. Cache fragments will also be expired when the view fragment changes (e.g., the HTML in the view changes). Cache stores like Memcached will automatically evict old cache entries.

<% @products.each do |product| %>
  <% cache product do %>
   <%= render product %>
  <% end %>
<% end %>

So the view fragment is cached, and when the updated_at of the ActiveRecord object associated with the view changes or the HTML changes, a new cache entry is created. Understandable.

I don't want to cache a view. I want to cache a Hash collection built from an ActiveRecord query. I know Rails has SQL Caching, where it caches the result of the exact same query when used within one request. But I need the results available across multiple requests, updated only when updated_at changes for one of the objects or a new object is added to the Hash collection.

Low-Level Caching caches a particular value or query result instead of caching view fragments.

def get_events
  @events = Event.search(params)

  event_data = nil # declared outside the block so it is visible to render below
  time = Benchmark.measure {
    event_data = Rails.cache.fetch 'event_data' do
      # A TON OF EVENTS TO LOAD ON CALENDAR
      @events.collect do |event|
        {
          title: event.title,
          description: event.description || '',
          start: event.starttime.iso8601,
          end: event.endtime.iso8601,
          allDay: event.all_day,
          recurring: event.event_series_id ? true : false,
          backgroundColor: (event.event_category.color || "red"),
          borderColor: (event.event_category.color || "red")
        }
      end
    end
  }
  Rails.logger.info("CALENDAR EVENT LOAD TIME: #{time.real}")

  render json: event_data.to_json
end

But right now I do not think the cache expires if one of the events is updated or a new hash is added to the collection. How can I do this?

asked Apr 06 '18 by Daniel Viglione


People also ask

Does ActiveRecord cache?

ActiveRecord makes accessing your database easy, but it can also help make it faster by its intelligent use of caching.

How does Rails query cache work?

Query caching is a Rails feature that caches the result set returned by each query. If Rails encounters the same query again for that request, it will use the cached result set as opposed to running the query against the database again.
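For instance, within a single controller action, repeating the same relation only hits the database once; a minimal sketch:

# Inside one request (the query cache is enabled around controller actions):
Event.where(all_day: true).to_a  # runs the SQL and caches the result set
Event.where(all_day: true).to_a  # identical SQL, so the cached result set is reused
# The query cache is cleared at the end of the request.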

What is low level caching?

What Rails calls low-level caching is really just reading and writing data to a key-value store. Out of the box, Rails supports an in-memory store, files on the filesystem, and external stores like Redis or Memcached. It is called "low level" caching because you are dealing with the Rails cache store directly, reading and writing keys yourself, rather than caching rendered views.
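A minimal sketch of what that looks like:

Rails.cache.write("some_key", { anything: "serializable" })  # store a value under a key
Rails.cache.read("some_key")    # => { anything: "serializable" }
Rails.cache.delete("some_key")  # remove the entry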

How do you use Redis caching in Rails?

To use Redis as a Rails cache store, use a dedicated cache instance set up as an LRU (Least Recently Used) cache instead of pointing the store at your existing Redis server, to make sure entries are dropped from the store when it reaches its maximum size.
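For reference, a minimal sketch of such a setup with the built-in Redis cache store (Rails 5.2+); the environment variable name is an assumption:

# config/environments/production.rb
# Point Rails at a Redis instance dedicated to caching, configured with an LRU
# eviction policy (e.g. maxmemory-policy allkeys-lru), not your main Redis server.
config.cache_store = :redis_cache_store, {
  url: ENV.fetch("REDIS_CACHE_URL", "redis://localhost:6379/1")
}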


1 Answer

Rails.cache.fetch is a shortcut to read and write

What we're trying to do with the cache is this:

value = Rails.cache.read( :some_key )
if value.nil?
  expected_value = ...
  Rails.cache.write( :some_key, expected_value )
end

We first try to read from the cache, and if no value exists, we retrieve the data from wherever it lives, add it to the cache, and do whatever we need to do with it.

On subsequent calls, we'll access the :some_key cache key again, but this time it will exist, so we won't need to retrieve the value again: we just take the value that's already cached.

This is what fetch does all at once:

value = Rails.cache.fetch( :some_key ) do
  # if :some_key doesn't exist in cache, the result of this block
  # is stored into :some_key and gets returned
end

It's no more than a very handy shortcut, but it's important to understand its behaviour.

How do we deal with our low-level caching?

The key here (pun intended) is to choose a cache key that changes when the cached data is no longer up to date with the underlying data. It's generally easier than updating the existing cache values: instead, you make sure never to reuse a cache key under which old data has been stored.

For instance, to store all events data in the cache, we'd do something like:

def get_events
  # Get the most recent event, this should be a pretty fast query
  last_modified = Event.order(:updated_at).last
  # Turns a 2018-01-01 01:23:45 datetime into 20180101012345
  # We could use to_i or anything else but examples would be less readable
  last_modified_str = last_modified.updated_at.utc.to_s(:number) 
  # And our cache key would be
  cache_key = "all_events/#{last_modified_str}"

  # Let's check this cache key: if it doesn't exist in our cache store, 
  # the block will store all events at this cache key, and return the value
  all_events = Rails.cache.fetch(cache_key) do 
    Event.all
  end

  # do whatever we need to with all_events variable
end

What's really important here:

  • The main data loading happens inside the fetch block. It must not be triggered every time you enter this method, or you lose all the benefit of caching.
  • The choice of the key is paramount! It must:
    • Change as soon as your data gets stale. Otherwise, you'd hit an "old" cache key with old data, and wouldn't be serving the latest data.
    • BUT! Determining the cache key must cost a lot less than retrieving the cached data, since you'll be "calculating" the cache key every single time you enter this method. So if determining the cache key takes longer than retrieving the actual data, you may have to approach things differently (see the sketch after this list).
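
For instance, instead of ordering and loading a whole record just to read its timestamp, you can ask the database for the latest updated_at directly. A minimal sketch (not part of the original answer, same idea as the get_events example below):

# SELECT MAX("events"."updated_at") FROM "events" -- a cheap aggregate query
last_modified_at = Event.maximum(:updated_at)   # nil if the table is empty
cache_key = "all_events/#{last_modified_at&.utc&.to_s(:number) || 'empty'}"

Like order(:updated_at).last, this also covers newly created events, since they get a fresh updated_at.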

An example with this previous method

Let's check how the previous get_events method behaves with an example. Say we have the following Events in the DB:

| ID | Updated_at       | 
|----|------------------|
|  1 | 2018-01-01 00:00 | 
|  2 | 2018-02-02 00:00 | 
|  3 | 2018-03-03 00:00 |

First call

At this point, let's call get_events. Event #3 is the most recently updated one, so Rails will check the cache key all_events/20180303000000. It does not exist yet, so all events will be requested from the DB and stored into the cache with this cache key.

Same data, subsequent calls

If you don't update any of those events, all subsequent calls to get_events will hit the cache key all_events/20180303000000, which now exists and contains all events. Therefore, you won't hit the DB and will just use the value from the cache.

What if we modify an Event?

Event.find(2).touch

We've modified event #2, so what was previously stored in the cache is no longer up to date. We now have the following events list:

| ID | Updated_at       | 
|----|------------------|
|  1 | 2018-01-01 00:00 | 
|  2 | 2018-04-07 19:27 | <--- just updated :) 
|  3 | 2018-03-03 00:00 |

The next call to get_events will take the most recent event (#2 now), and therefore try to access the cache key all_events/20180407192700... which does not exist yet! Rails.cache.fetch will evaluate the block and put all the current events, in their current state, into this new key all_events/20180407192700. And you don't get served stale data.


What about your particular issue?

You'll have to find the proper cache key, and make it so that event data loading is done inside the fetch block.

Since you filter your events with params, the cache will depend on your params, so you'll need to find a way to represent the params as a string and integrate that into your cache key. Cached events will differ from one set of params to another.
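One way to build that string (a hypothetical helper, not part of the original answer; the permitted keys are assumptions, use whatever parameters actually drive Event.search):

require "digest"

# Hypothetical helper: builds a short, stable string out of the filtering params.
def params_cache_fragment
  # Keep only the params that influence the search, sorted so the order is stable.
  relevant = params.permit(:start_date, :end_date, :category_id).to_h.sort.to_h
  Digest::MD5.hexdigest(relevant.to_json)
end

Such a helper could then serve as the text_rep_of_params part of the cache key below.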

Find the most recently updated event for those params, to avoid retrieving stale data. We can use the ActiveRecord cache_key method on any ActiveRecord object, which is handy and avoids the tedious timestamp formatting we did previously.
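For reference, cache_key returns roughly the following (the exact format depends on your Rails version and the cache_versioning setting):

Event.find(2).cache_key
# => "events/2-20180407192700000000"   # classic format: model name, id and updated_at
# With cache_versioning enabled (the Rails 5.2+ default), cache_key is just "events/2"
# and the timestamp is exposed separately via cache_version.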

This should give you something like:

def get_events
  latest_event = Event.search(params).order(:updated_at).last
  # text_rep_of_params stands for a string representation of the given params
  # (for instance, the hypothetical params_cache_fragment helper sketched above)

  # Check https://apidock.com/rails/ActiveRecord/Base/cache_key
  # for an easy way to define cache_key for an ActiveRecord model instance
  cache_key = "filtered_events/#{text_rep_of_params}/#{latest_event.cache_key}"

  event_data = Rails.cache.fetch(cache_key) do
    events = Event.search(params)
    # A TON OF EVENTS TO LOAD ON CALENDAR
    events.collect do |event|
      {
        title: event.title,
        description: event.description || '',
        start: event.starttime.iso8601,
        end: event.endtime.iso8601,
        allDay: event.all_day,
        recurring: (event.event_series_id) ? true : false,
        backgroundColor: (event.event_category.color || "red"),
        borderColor: (event.event_category.color || "red")  
      }
    end
  end

  render json: event_data.to_json
end

Voilà! I hope it helped. Good luck with your implementation details.

answered Nov 15 '22 by Pierre-Adrien