Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Talk to data warehouse-style tables with ActiveRecord?

As my Rails app matures, it's becoming increasingly apparent that it has a strong data warehouse flavour, lacking only a facts table to make everything explicit.

On top of that, I just read Chapters 2 (Designing Beautiful APIs) and 3 (Mastering the Dynamic Toolkit) of Ruby Best Practices.

Now I'm trying to figure out how best to design the fact-retrieving part...

Say I have the following dimensions (existing Models in the app):

  • Product (contains funds)
  • Fund
  • Measure (e.g. total holding, average holding, average exposure)

... and a good old general-purpose Fact:

  • Fact (date, value, plus a foreign-key NULLable column for each of my dimensions)

Some aspects on which I'd be grateful to get some advice:

  • What might constitute a flexible retrieval interface?
  • What happens if I have Facts with both NULL (i.e. all, or don't care) and NOT NULL (specific) values for a dimension? A pseudo-value like :all? Or should some convention apply?
  • How to select only a subset of dimension values? Or exclude a subset? :only and :exclude?
  • Has anyone had experience with creating named_scopes to deal with this? There's the obvious attraction of being able to chain one for each dimension of interest, but does it get too clunky if we get to 7 or 8 dimensions?

(I'm aware that an acts_as_fact plugin is reputed to exist in some form (at least, there was some small buzz at RailsConf 2006) but I couldn't find any code or description of how it might have worked.)

Versions: Rails, ActiveRecord 2.1.2, Oracle Enhanced Adapter 1.2.0

EDIT: I looked at ActiveWarehouse and have some reservations: - the main branch hasn't had a commit since Nov-08 and no there's activity at all since Jan-09; - the tutorial dates to 2006, is admitted to be out of date, and 404s on me; - it seems to be wanting to get away from ActiveRecord - much of my app will stay in AR and I think at present that I want an AR solution.

So I'll steer clear of that one, thanks!

like image 653
Mike Woodhouse Avatar asked Oct 15 '22 14:10

Mike Woodhouse


1 Answers

What happens if I have Facts with both NULL (i.e. all, or don't care) and NOT NULL (specific) values for a dimension? A pseudo-value like :all? Or should some convention apply?

NULL would be a bid misleading because it stands for no association. I would use a value like -1 (if it is an integer foreign_key with only values > 0).

How to select only a subset of dimension values? Or exclude a subset?

with_scope()

you could also overwrite the find function

   def self.find(*args)
    if  anything
      with_scope(a_scope) do
         result = super *args
      end
    else
      result = super *args
    end
   end

   def self.a_scope
    {:find => { :conditions => ["person_id  = ?", me] , :readonly => true}}
   end

Has anyone had experience with creating named_scopes to deal with this? There's the obvious attraction of being able to chain one for each dimension of interest, but does it get too clunky if we get to 7 or 8 dimensions?

We have an olap database with 4 dimension and it works nice. I think if you implement some custom methods for active_record, you will have fun with your app.

I also found this: http://github.com/aeden/activewarehouse/tree/master

like image 92
Beffa Avatar answered Oct 20 '22 05:10

Beffa