 

How to connect Jupyter Ipython notebook to Amazon redshift

I am using OS X Yosemite. I installed the packages postgresql, psycopg2, and simplejson with conda install <package name>, and imported them after installation. I then created a JSON file with my Amazon Redshift credentials:

{
    "user_name": "YOUR USER NAME",
    "password": "YOUR PASSWORD",
    "host_name": "YOUR HOST NAME",
    "port_num": "5439",
    "db_name": "YOUR DATABASE NAME"
}

I used:

with open("Credentials.json") as fh:
    creds = simplejson.loads(fh.read())

But this throws an error. These were the instructions given on a website; I tried searching other websites, but none gives a good explanation.

Please let me know how I can connect Jupyter to Amazon Redshift.

asked Aug 13 '16 by SpaceOddity



2 Answers

There's a nice guide from RJMetrics here: "Setting up Your Analytics Stack with Jupyter Notebook & AWS Redshift". It uses the ipython-sql extension.

This works great and displays results in a grid.

In [1]:

import sqlalchemy
import psycopg2
import simplejson
%load_ext sql
%config SqlMagic.displaylimit = 10

In [2]:

with open("./my_db.creds") as fh:
    creds = simplejson.loads(fh.read())

connect_to_db = 'postgresql+psycopg2://' + \
                creds['user_name'] + ':' + creds['password'] + '@' + \
                creds['host_name'] + ':' + creds['port_num'] + '/' + creds['db_name']
%sql $connect_to_db
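One gotcha with building the URL by plain string concatenation: a password containing characters like @ or : will break the connection string. A small sketch of the same idea with percent-encoding via the standard library's urllib.parse.quote_plus (the credential values here are made up for illustration; in practice they would come from the creds file above):

```python
from urllib.parse import quote_plus

# Hypothetical credentials, shaped like the my_db.creds file above
creds = {
    "user_name": "analyst",
    "password": "p@ss:word",   # '@' and ':' would break naive concatenation
    "host_name": "example.redshift.amazonaws.com",
    "port_num": "5439",
    "db_name": "mydb",
}

connect_to_db = "postgresql+psycopg2://{}:{}@{}:{}/{}".format(
    creds["user_name"],
    quote_plus(creds["password"]),  # percent-encode special characters
    creds["host_name"],
    creds["port_num"],
    creds["db_name"],
)
```

The encoded password arrives at the driver intact, because SQLAlchemy decodes the URL before connecting.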

In [3]:

%sql SELECT * FROM my_table LIMIT 25;
answered Oct 09 '22 by Joe Harris


Here's how I do it:

----INSERT IN CELL 1-----
import psycopg2
redshift_endpoint = "<add your endpoint>"
redshift_user = "<add your user>"
redshift_pass = "<add your password>"
port = <your port>
dbname = "<your db name>"

----INSERT IN CELL 2-----
from sqlalchemy import create_engine
from sqlalchemy import text
engine_string = "postgresql+psycopg2://%s:%s@%s:%d/%s" \
% (redshift_user, redshift_pass, redshift_endpoint, port, dbname)
engine = create_engine(engine_string)

----INSERT IN CELL 3 - THIS EXAMPLE WILL GET ALL TABLES FROM YOUR DATABASE-----
sql = """
select schemaname, tablename from pg_tables order by schemaname, tablename;
"""
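As an aside, SQLAlchemy's inspector can produce the same table listing without raw SQL. A sketch against an in-memory SQLite engine purely to illustrate the call (against Redshift you would pass the engine built in cell 2; assumes SQLAlchemy 1.4+):

```python
from sqlalchemy import create_engine, inspect, text

# In-memory SQLite stands in for the Redshift engine in this illustration
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE tablename_demo (id INTEGER)"))

# Inspector reflects table names from the connected database
inspector = inspect(engine)
print(inspector.get_table_names())  # ['tablename_demo']
```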

----LOAD RESULTS AS TUPLES TO A LIST-----
tables = []
with engine.connect() as conn:  # engine.execute() was removed in SQLAlchemy 2.0
    for row in conn.execute(text(sql)):
        tables.append(row)
tables

--IF YOU'RE USING PANDAS---
import pandas as pd
raw_data = pd.read_sql_query(text(sql), engine)
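The engine / text() / read_sql_query pattern can be exercised locally before pointing it at Redshift. A self-contained sketch using an in-memory SQLite engine as a stand-in (the table and rows here are invented for the demo; assumes pandas and SQLAlchemy are installed):

```python
import pandas as pd
from sqlalchemy import create_engine, text

# In-memory SQLite stands in for the Redshift engine, purely to exercise the pattern
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE my_table (id INTEGER, name TEXT)"))
    conn.execute(text("INSERT INTO my_table VALUES (1, 'a'), (2, 'b')"))

# Same call as above: pandas builds a DataFrame straight from the query
raw_data = pd.read_sql_query(text("SELECT * FROM my_table ORDER BY id"), engine)
print(raw_data.shape)  # (2, 2)
```

Swapping the SQLite URL for the Redshift connection string is the only change needed to run the same cell against the real cluster.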
answered Oct 09 '22 by jason_in_la