So I have a Redshift database with enough tables that it feels worth my time to build a front end to make querying it a little bit easier than just typing in SQL commands.
Ideally, I'd be able to do this by connecting the database to a Rails app (because I have a bit of experience with Rails). I'm not sure how I'd connect a remote Redshift database to a local Rails application though, or how to make activerecord work with redshift.
Does anyone have any suggestions/resources to help me get started? I'm open to other options to connect the Redshift database to a front end if there are pre-made options easier than Rails.
To query databases hosted by your Amazon Redshift cluster, you have two options: Connect to your cluster and run queries on the AWS Management Console with the query editor. If you use the query editor on the Amazon Redshift console, you don't have to download and set up a SQL client application.
The SQL language consists of commands that you use to create and manipulate database objects, run queries, load tables, and modify the data in tables. Amazon Redshift is based on PostgreSQL.
To use the query editor on the Amazon Redshift console On the navigation menu, choose Query editor, then connect to a database in your cluster. For Schema, choose public to create a new table based on that schema. Enter the following in the query editor window and choose Run to create a new table.
The Query Editor provides an in-browser interface for running SQL queries on Amazon Redshift clusters. For queries that are run on compute nodes, you can then view the query results and query execution plan next to your queries.
#app/models/data_warehouse.rb
class DataWarehouse < ActiveRecord::Base
establish_connection "redshift_staging"
#or, if you want to have a db per environment
#establish_connection "redshift_#{Rails.env}"
end
Note that we are connecting on 5439, not the default 5432 so I specify the port Also, I specify a schema, beta, which is what we use for our unstable aggregates, you could either have a different db per environment as mentioned above, or use various schemas and include them in the search path for ActiveRecord
#config/database.yml
redshift_staging:
adapter: postgresql
encoding: utf8
database: db03
port: 5439
pool: 5
schema_search_path: 'beta'
username: admin
password: supersecretpassword
host: db03.myremotehost.us #your remote host here, might be an aws url from Redshift admin console
###OPTION 2, a direct PG Connection
class DataWarehouse < ActiveRecord::Base
attr_accessor :conn
def initialize
@conn = PG.connect(
database: 'db03',
port: 5439,
pool: 5,
schema_search_path: 'beta',
username: 'admin',
password: 'supersecretpassword',
host: 'db03.myremotehost.us'
)
end
end
[DEV] main:0> redshift = DataWarehouse
E, [2014-07-17T11:09:17.758957 #44535] ERROR -- : PG::InsufficientPrivilege: ERROR: permission denied to set parameter "client_min_messages" to "notice" : SET client_min_messages TO 'notice'
(pry) output error: #<ActiveRecord::StatementInvalid: PG::InsufficientPrivilege: ERROR: permission denied to set parameter "client_min_messages" to "notice" : SET client_min_messages TO 'notice'>
UPDATE:
https://github.com/fiksu/activerecord-redshift-adapter
Reason 1: ActiveRecord postgresql adapter sets client_min_messages Reason 2: adapter also attempts to set Time Zone, which redshift doesn't allow (http://docs.aws.amazon.com/redshift/latest/dg/c_redshift-and-postgres-sql.html) Reason 3: Even if you change the code in ActiveRecord for the first two errors, you run into additional errors that complain that Redshift is using Postgresql 8.0, at that point I moved on to the adapter, will revisit and update if I find something better later.
#Gemfile:
gem 'activerecord4-redshift-adapter', github: 'aamine/activerecord4-redshift-adapter'
#config/database.yml
redshift_staging:
adapter: redshift
encoding: utf8
database: db03
port: 5439
pool: 5
username: admin
password: supersecretpassword
host: db03.myremotehost.us
timeout: 5000
#app/models/base_aggregates_redshift_test.rb
#Model named to match my tables in Redshift, if you want you can set_table like I have commented out below
class BaseAggregatesRedshiftTest < ActiveRecord::Base
establish_connection "redshift_staging"
self.table_name = "beta.base_aggregates_v2"
end
[DEV] main:0> redshift = BaseAggregatesRedshiftTest.first
D, [2014-07-17T15:31:58.678103 #43776] DEBUG -- : BaseAggregatesRedshiftTest Load (45.6ms) SELECT "beta"."base_aggregates_v2".* FROM "beta"."base_aggregates_v2" LIMIT 1
#app/models/base_aggregates_redshift_test.rb
class BaseAggregatesRedshiftTest < ActiveRecord::Base
set_table "beta.base_aggregates_v2"
ActiveRecord::Base.establish_connection(
adapter: 'redshift',
encoding: 'utf8',
database: 'staging',
port: '5439',
pool: '5',
username: 'admin',
password: 'supersecretpassword',
search_schema: 'beta',
host: 'db03.myremotehost.us',
timeout: '5000'
)
end
#in console, abbreviated example of first record, now it's using the new name for my redshift table, just assuming I've got the record at base_aggregates_redshift_tests because I didn't set the table_name
[DEV] main:0> redshift = BaseAggregatesRedshiftTest.first
D, [2014-07-17T15:09:39.388918 #11537] DEBUG -- : BaseAggregatesRedshiftTest Load (45.3ms) SELECT "base_aggregates_redshift_tests".* FROM "base_aggregates_redshift_tests" LIMIT 1
#<BaseAggregatesRedshiftTest:0x007fd8c4a12580> {
:truncated_month => Thu, 31 Jan 2013 19:00:00 EST -05:00,
:dma => "Cityville",
:group_id => 9712338,
:dma_id => 9999
}
Good luck @johncorser!
This tutorial helps you setup a rails app with a redshift adapter:
https://www.credible.com/code/setting-up-a-data-warehouse-with-aws-redshift-and-ruby/
In a nutshell:
Clone the sample app:
git clone [email protected]:tuesy/redshift-ruby-tutorial.git
cd redshift-ruby-tutorial
Setup ENV Variables via ~/.bashrc
(or dotenv):
export REDSHIFT_HOST=redshift-ruby-tutorial.ccmj2nxbsay7.us-east-1.redshift.amazonaws.com
export REDSHIFT_PORT=5439
export REDSHIFT_USER=deploy
export REDSHIFT_PASSWORD=<your password here>
export REDSHIFT_DATABASE=analytics
export REDSHIFT_BUCKET=redshift-ruby-tutorial
Use the gem activerecord4-redshift-adapter
, in Gemfile:
'activerecord4-redshift-adapter', '~> 0.2.0' # For Rails 4.2
'activerecord4-redshift-adapter', '~> 0.1.1' # For Rails 4.1
Then you can query into redshift like you would with a normal AR model:
bundle exec rails c
RedshiftUser.count
(Disclosure: I havn't yet given this method a try, but I may soon)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With