I am using OS X Yosemite. I have installed the packages postgresql, psycopg2, and simplejson with conda install <package name>. After installation I imported these packages. Then I created a JSON file with my Amazon Redshift credentials:
{
"user_name": "YOUR USER NAME",
"password": "YOUR PASSWORD",
"host_name": "YOUR HOST NAME",
"port_num": "5439",
"db_name": "YOUR DATABASE NAME"
}
I tried to read it with
with open("Credentials.json") as fh:
    creds = simplejson.loads(fh.read())
But this throws an error. These were the instructions given on a website; I searched other sites but none explains this well.
Please let me know how I can connect Jupyter to Amazon Redshift.
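For the credentials step itself, here is a minimal sketch of reading the JSON file above and assembling a connection URL (the file name and keys follow the JSON shown in the question; the stdlib json module parses the same files as simplejson, and load_creds/build_url are illustrative names, not from any library):

```python
import json  # stdlib json; simplejson exposes the same interface

def load_creds(path):
    """Read the credentials JSON shown above into a dict."""
    with open(path) as fh:
        return json.load(fh)

def build_url(creds):
    """Assemble a SQLAlchemy-style connection URL from the credential keys."""
    return ("postgresql+psycopg2://{user_name}:{password}"
            "@{host_name}:{port_num}/{db_name}").format(**creds)

# Usage (requires a reachable Redshift cluster):
# creds = load_creds("Credentials.json")
# engine = sqlalchemy.create_engine(build_url(creds))
```

Note the body of the with block must be indented; an unindented second line is the most common cause of an error with this snippet.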
There's a nice guide from RJMetrics: "Setting up Your Analytics Stack with Jupyter Notebook & AWS Redshift". It uses the ipython-sql extension.
This works great and displays results in a grid.
In [1]:
import sqlalchemy
import psycopg2
import simplejson
%load_ext sql
%config SqlMagic.displaylimit = 10
In [2]:
with open("./my_db.creds") as fh:
    creds = simplejson.loads(fh.read())
connect_to_db = 'postgresql+psycopg2://' + \
    creds['user_name'] + ':' + creds['password'] + '@' + \
    creds['host_name'] + ':' + creds['port_num'] + '/' + creds['db_name']
%sql $connect_to_db
In [3]:
%sql SELECT * FROM my_table LIMIT 25;
Here's how I do it:
----INSERT IN CELL 1-----
import psycopg2
redshift_endpoint = "<add your endpoint>"
redshift_user = "<add your user>"
redshift_pass = "<add your password>"
port = <your port>
dbname = "<your db name>"
----INSERT IN CELL 2-----
from sqlalchemy import create_engine
from sqlalchemy import text
engine_string = "postgresql+psycopg2://%s:%s@%s:%d/%s" \
% (redshift_user, redshift_pass, redshift_endpoint, port, dbname)
engine = create_engine(engine_string)
----INSERT IN CELL 3 - THIS EXAMPLE WILL GET ALL TABLES FROM YOUR DATABASE-----
sql = """
select schemaname, tablename from pg_tables order by schemaname, tablename;
"""
----LOAD RESULTS AS TUPLES TO A LIST-----
tables = []
output = engine.execute(sql)  # SQLAlchemy 1.x; in 2.x use engine.connect() and conn.execute(text(sql))
for row in output:
    tables.append(row)
tables
--IF YOU'RE USING PANDAS---
import pandas as pd
raw_data = pd.read_sql_query(text(sql), engine)
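If you already fetched the rows as tuples in cell 3, you can also build the DataFrame locally without a second round trip to Redshift (a sketch; the sample rows are made up, and the column names match the pg_tables query above):

```python
import pandas as pd

# rows as fetched in cell 3, e.g. from SELECT schemaname, tablename FROM pg_tables
tables = [("public", "users"), ("public", "orders")]

# build the DataFrame from the already-fetched tuples
df = pd.DataFrame(tables, columns=["schemaname", "tablename"])
```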