Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient Ruby LRU cache

Tags:

What's the most efficient way to build a cache with arbitrary Ruby objects as keys that are expired based on a least recently used algorithm. It should use Ruby's normal hashing semantics (not equal?)

like image 328
Yehuda Katz Avatar asked Dec 19 '09 19:12

Yehuda Katz


2 Answers

I know its a few years late, but I just implemented what I believe is the fastest LRU Cache out there for Ruby.

It is also tested and optionally safe to use in multi threaded environments.

https://github.com/SamSaffron/lru_redux


Note: in Ruby 1.9 Hash is ordered, so you can cheat and build the fastest LRU cache in a few lines of code

class LruRedux::Cache19    def initialize(max_size)     @max_size = max_size     @data = {}   end    def max_size=(size)     raise ArgumentError.new(:max_size) if @max_size < 1     @max_size = size     if @max_size < @data.size       @data.keys[0..@[email protected]].each do |k|         @data.delete(k)       end     end   end    def [](key)     found = true     value = @data.delete(key){ found = false }     if found       @data[key] = value     else       nil     end   end    def []=(key,val)     @data.delete(key)     @data[key] = val     if @data.length > @max_size       @data.delete(@data.first[0])     end     val   end    def each     @data.reverse.each do |pair|       yield pair     end   end    # used further up the chain, non thread safe each   alias_method :each_unsafe, :each    def to_a     @data.to_a.reverse   end    def delete(k)     @data.delete(k)   end    def clear     @data.clear   end    def count     @data.count   end    # for cache validation only, ensures all is sound   def valid?     true   end end 
like image 197
Sam Saffron Avatar answered Sep 18 '22 13:09

Sam Saffron


This pushes the boundaries of my understanding of how Ruby uses memory, but I suspect that the most efficient implementation would be a doubly-linked list where every access moves the key to the front of the list, and every insert drops an item if the maximum size has been reached.

However, assuming Ruby's Hash class is already very efficient, I'd bet that the somewhat naive solution of simply adding age data to a Hash would be fairly good. Here's a quick toy example that does this:

class Cache   attr_accessor :max_size    def initialize(max_size = 4)     @data = {}     @max_size = max_size   end    def store(key, value)     @data.store key, [0, value]     age_keys     prune   end    def read(key)     if value = @data[key]       renew(key)       age_keys     end     value   end    private # -------------------------------    def renew(key)     @data[key][0] = 0   end    def delete_oldest     m = @data.values.map{ |v| v[0] }.max     @data.reject!{ |k,v| v[0] == m }   end    def age_keys     @data.each{ |k,v| @data[k][0] += 1 }   end    def prune     delete_oldest if @data.size > @max_size   end end 

There's probably a faster way of finding the oldest item, and this is not thoroughly tested, but I'd be curious to know how anyone thinks this compares to a more sophisticated design, linked list or otherwise.

like image 35
Alex Reisner Avatar answered Sep 21 '22 13:09

Alex Reisner