
What is a good way to memoize data across many instances of a class in Ruby?

Consider many instances of a class that generates data. Ideally, that data would be generated only once per run.

class HighOfNPeriods < Indicator
  def generate_data
    @indicator_data = DataStream.new
    (0..@source_data.data.count - 1).each do |i|
      if i < @params[:n_days]
      ...
      @indicator_data.add one_data
    end
  end
end

There are different instances of HighOfNPeriods with different params and different @source_data.

Here is how the indicator is used:

class Strategy
  attr_accessor :indicators

  def initialize params
    ...
  end
end

The instance method HighOfNPeriods#generate_data is called from within Strategy. Each Strategy gets a new instance of HighOfNPeriods, so it's not possible to pull the data out as some kind of global value. Besides that, it should not be global.

A simple `unless @indicator_data` guard wouldn't work, because the data needs to be shared across many instances of HighOfNPeriods.

So, the question is:

What is a good way to memoize the instance variable `indicator_data` 
or the object `HighOfNPeriods` when there are many different instances, 
some of which have different data?

One solution is to store the data using ActiveRecord, but that's not really the way I want to do it at this time because:

  1. Not all the data can be generated in advance because there are too many permutations of params. It makes more sense to check whether it has been generated before and then generate (and save) it as necessary.
  2. It doesn't take long to generate the data. It may be generated once and used hundreds of times each run.
  3. It will be faster to access the data from within the object than to pull it from the database.

Ruby 1.9.3

B Seven asked Nov 25 '12 15:11



2 Answers

If you can't memoize it at the instance level, go one level up and use an instance variable on the class itself.

class Foo
  # I used *args here for simplicity of the example code.
  def generate_data *args
    if res = self.class.cache[args]
      puts "Data in cache for #{args.inspect} is #{res}"
      return res
    end

    puts "calculating..."
    p1, p2 = args
    res = p1 * 10 + p2
    self.class.cache[args] = res
    res
  end

  def self.cache
    @cache ||= {}
  end
end


puts Foo.new.generate_data 1, 2 
puts Foo.new.generate_data 3, 4
puts Foo.new.generate_data 1, 2
# >> calculating...
# >> 12
# >> calculating...
# >> 34
# >> Data in cache for [1, 2] is 12
# >> 12
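Applied to the question's setup, the same idea might look like the sketch below. `SketchIndicator`, its constructor, the `computations` counter, and the trailing-window maximum are all hypothetical stand-ins, since the original generation logic is elided. The sketch also assumes the params and source data compare by value (as Hashes and Arrays do) and are not mutated after being used as a cache key.

```ruby
# Hypothetical stand-in for an indicator; not the asker's actual class.
class SketchIndicator
  class << self
    # Class-level instance variable: one cache shared by all instances of this class.
    def cache
      @cache ||= {}
    end

    # Counts real computations, for demonstration only.
    attr_writer :computations

    def computations
      @computations ||= 0
    end
  end

  def initialize(params, source_data)
    @params = params
    @source_data = source_data
  end

  def generate_data
    key = [@params, @source_data]
    # Hash#fetch with a block runs the block only when the key is missing,
    # so a cached nil or false would still count as a hit.
    self.class.cache.fetch(key) do
      self.class.computations += 1
      self.class.cache[key] = compute
    end
  end

  private

  # Placeholder calculation (an assumption): highest value in the trailing n-day window.
  def compute
    n = @params[:n_days]
    @source_data.each_index.map do |i|
      window_start = [0, i - n + 1].max
      @source_data[window_start..i].max
    end
  end
end

prices = [1, 3, 2, 5, 4]
a = SketchIndicator.new({ :n_days => 2 }, prices).generate_data
b = SketchIndicator.new({ :n_days => 2 }, prices.dup).generate_data # equal key: cache hit
c = SketchIndicator.new({ :n_days => 3 }, prices).generate_data     # different params: computed
```

Here `a` and `b` are the same object, and the calculation runs twice in total (once per distinct key) rather than three times.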
Sergio Tulentsev answered Oct 09 '22 18:10



Create a class variable @@indicator_data: a hash keyed by [@params, @source_data] whose values are the corresponding @indicator_data. Then, in generate_data, memoize on @@indicator_data[[@params, @source_data]].

class HighOfNPeriods < Indicator
  @@indicator_data = {}
  def generate_data
    @indicator_data = @@indicator_data[[@params, @source_data]] ||= DataStream.new
    ...
  end
end
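One practical difference between a class variable (`@@name`) and an instance variable on the class itself (`@name` inside a class method) shows up with inheritance, which may matter if the cache were ever hoisted into a base class such as Indicator. The sketch below uses hypothetical class names (`BaseSketch`, `HighSketch`, `LowSketch`) purely to illustrate that Ruby behavior.

```ruby
# Hypothetical classes, used only to contrast the two kinds of class-level storage.
class BaseSketch
  # Class variable: a single Hash shared by BaseSketch and every subclass.
  @@shared_cache = {}

  def self.shared_cache
    @@shared_cache
  end

  # Instance variable on the class object: each class lazily gets its own Hash.
  def self.own_cache
    @own_cache ||= {}
  end
end

class HighSketch < BaseSketch; end
class LowSketch < BaseSketch; end

HighSketch.shared_cache[:k] = 1
HighSketch.own_cache[:k] = 1

shared_leaks = LowSketch.shared_cache.key?(:k) # true: LowSketch sees HighSketch's entry
own_leaks    = LowSketch.own_cache.key?(:k)    # false: LowSketch has a separate Hash
```

With a shared class variable, including the class in the cache key is one way to keep entries from different indicator types apart.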
sawa answered Oct 09 '22 17:10
