ETL framework for loading data into Rails app

I need to load data for my Rails application from multiple providers (REST/SOAP based XML feeds) into the database on a recurring basis. I have written a set of Rake tasks which are kicked off by whenever-generated cron jobs. Each task hits the partner feed endpoint, parses the feed and loads it into the database.
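For reference, the fetch/parse/load flow described above might look roughly like the sketch below. It is only an illustration under stated assumptions: the feed layout, `SAMPLE_FEED`, `parse_feed`, and `load_records` are all hypothetical names, and a plain Hash stands in for the database table so the example runs without Rails. A real task would fetch the XML from the partner endpoint and write through the application's models.

```ruby
require "rexml/document"

# Hypothetical feed payload; a real task would fetch this over HTTP
# from the partner endpoint instead of using an inline string.
SAMPLE_FEED = <<~XML
  <products>
    <product id="1"><name>Widget</name><price>9.99</price></product>
    <product id="2"><name>Gadget</name><price>19.50</price></product>
  </products>
XML

# Extract + transform: turn the XML into plain hashes.
def parse_feed(xml)
  doc = REXML::Document.new(xml)
  doc.get_elements("//product").map do |node|
    {
      external_id: node.attributes["id"],
      name: node.elements["name"].text,
      price: node.elements["price"].text.to_f
    }
  end
end

# Load: insert new rows or overwrite existing ones, keyed by external_id.
# `store` is a Hash standing in for a database table (an assumption made
# to keep the sketch self-contained).
def load_records(store, records)
  records.each { |rec| store[rec[:external_id]] = rec }
  store
end

store = {}
load_records(store, parse_feed(SAMPLE_FEED))
```

Keeping the parse step separate from the load step means each provider only needs its own `parse_feed`-style function, while the load logic can be shared across tasks.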

Instead of writing Rake tasks, should I use an ETL framework like ActiveWarehouse (http://activewarehouse.rubyforge.org/etl/)? Any suggestions on the best way to do this in Rails?

asked Jan 18 '10 21:01 by Bilal and Olga


1 Answer

If you are only loading data into a set of tables, the use case is simple (adding new records or making basic updates to existing ones), and your current load meets your requirements, I would stick with the Rake tasks. You could certainly use ActiveWarehouse as well, but it sounds like overkill. If, however, you need to support changing dimensions (i.e., preserving the history of data changes over time) or other 'data warehouse' features, then something like ActiveWarehouse starts to have more value.
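To make "preserving the history of data changes" concrete, here is a minimal sketch of the idea, commonly called a type 2 slowly changing dimension: instead of overwriting a row, the loader closes the current version and appends a new one. This is not ActiveWarehouse's API; `DimensionRow` and `apply_change` are hypothetical names, and an Array stands in for the history table.

```ruby
require "date"

# One row per version of a record; valid_to is nil for the current version.
DimensionRow = Struct.new(:external_id, :attrs, :valid_from, :valid_to)

# Apply an incoming record to the history table.
# `rows` is an Array standing in for a database table (an assumption).
def apply_change(rows, external_id, attrs, as_of)
  current = rows.find { |r| r.external_id == external_id && r.valid_to.nil? }
  return rows if current && current.attrs == attrs # nothing changed

  current.valid_to = as_of if current # close the old version
  rows << DimensionRow.new(external_id, attrs, as_of, nil)
  rows
end

rows = []
apply_change(rows, "1", { price: 9.99 }, Date.new(2010, 1, 1))
apply_change(rows, "1", { price: 9.99 }, Date.new(2010, 1, 2))  # unchanged, no new row
apply_change(rows, "1", { price: 12.50 }, Date.new(2010, 1, 3)) # new version appended
```

If your loads ever need this kind of bookkeeping across many tables, that is roughly the point where a dedicated ETL framework begins to pay for itself; a plain insert-or-update load does not need it.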

answered Oct 01 '22 23:10 by Cliff Darling