
Connecting Python/pandas to Redshift when SSL is required

My company recently changed our Redshift cluster and now they require an SSL connection. In the past I've connected Python/pandas to Redshift via the method I've detailed here: http://measureallthethin.gs/blog/connect-python-and-pandas-to-redshift/

From the SQLAlchemy documentation, it looks like all I need to do is add connect_args={'sslmode':'require'} to the create_engine() call, as this thread pointed out: How do I connect to Postgresql using SSL from SqlAchemy+pg8000?
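For reference, here is a minimal sketch of what that change looks like. The host, database, user, and password are placeholders rather than real values, and the helper names redshift_url and make_engine are made up for illustration. Port 5439 is the Redshift default.

```python
from urllib.parse import quote_plus


def redshift_url(user, password, host, database, port=5439):
    """Build a SQLAlchemy URL for the psycopg2 driver (password is URL-escaped)."""
    return (
        f"postgresql+psycopg2://{user}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


def make_engine(url):
    """Create an engine that asks libpq for an SSL connection."""
    # Imported lazily so the URL helper works without SQLAlchemy installed.
    from sqlalchemy import create_engine

    # connect_args is passed through to psycopg2.connect(); sslmode is a libpq option.
    return create_engine(url, connect_args={"sslmode": "require"})


# Placeholder connection details, not a real cluster.
url = redshift_url(
    user="awsuser",
    password="my_password",
    host="examplecluster.abc123xyz789.us-west-1.redshift.amazonaws.com",
    database="dev",
)
print(url)
```

With a working driver, pandas.read_sql("select ...", make_engine(url)) would then load a query result into a DataFrame.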

However, I now get this error:

OperationalError: (psycopg2.OperationalError) sslmode value "require" invalid when SSL support is not compiled in

I use the Anaconda distribution for a number of packages, and found I needed to update my psycopg2 package per these instructions: https://groups.google.com/a/continuum.io/d/msg/conda/Fqv93VKQXAc/mHqfNK8xZWsJ

However, even after updating psycopg2 I'm still getting the same error and am at a loss at this point on how to further debug. I'd like to figure this out so I can get our Redshift data directly into pandas.
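One thing that may help narrow this down is confirming which psycopg2 installation Python actually imports, since an older copy elsewhere on the path could shadow the updated one. The sketch below is only a diagnostic aid under that assumption. The helper name describe_driver is hypothetical, and __libpq_version__ is only present in newer psycopg2 releases, so it is read defensively.

```python
import importlib


def describe_driver(module_name="psycopg2"):
    """Return basic facts about an importable driver module, or None if it is missing."""
    try:
        mod = importlib.import_module(module_name)
    except ImportError:
        return None
    return {
        # Path of the module that was actually imported.
        "file": getattr(mod, "__file__", None),
        "version": getattr(mod, "__version__", None),
        # libpq version the package was built against, if the attribute exists.
        "libpq_build_version": getattr(mod, "__libpq_version__", None),
    }


info = describe_driver()
print(info if info is not None else "psycopg2 is not importable in this environment")
```

If the reported file path is not inside the Anaconda environment that was updated, the old build is probably still the one being loaded.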

measureallthethings asked Mar 17 '26 23:03


1 Answer

AWS has developed an Amazon Redshift connector for Python, redshift_connector (see its GitHub repo), which can be used for this instead of psycopg2.

To install it, one may install from source:

git clone https://github.com/aws/amazon-redshift-python-driver.git
cd amazon-redshift-python-driver
pip install .

Or install the prebuilt package from PyPI:

pip install redshift_connector

Or with Conda:

conda install -c conda-forge redshift_connector

Here is an example

import redshift_connector

# Connects to the Redshift cluster using database user credentials
conn = redshift_connector.connect(
    host='examplecluster.abc123xyz789.us-west-1.redshift.amazonaws.com',
    database='dev',
    user='awsuser',
    password='my_password'
)

cursor: redshift_connector.Cursor = conn.cursor()
cursor.execute("create Temp table book(bookname varchar,author varchar)")
cursor.executemany("insert into book (bookname, author) values (%s, %s)",
                    [
                        ('One Hundred Years of Solitude', 'Gabriel García Márquez'),
                        ('A Brief History of Time', 'Stephen Hawking')
                    ]
                  )
cursor.execute("select * from book")

result: tuple = cursor.fetchall()
print(result)
# Output: (['One Hundred Years of Solitude', 'Gabriel García Márquez'], ['A Brief History of Time', 'Stephen Hawking'])

Note that one of the connection parameters one can pass is ssl, which controls whether SSL is enabled. Its default value is True, so the connection above already uses SSL.
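Since the goal is to get Redshift data into pandas, here is a rough sketch of how the SSL settings and a DataFrame fetch might be combined. The helper names ssl_connect_kwargs and query_to_dataframe are made up for illustration, the connection details are the same placeholders as above, and the sslmode value 'verify-ca' is an assumption about what the cluster should use. cursor.fetch_dataframe() needs pandas to be installed.

```python
def ssl_connect_kwargs(host, database, user, password, sslmode="verify-ca"):
    """Collect keyword arguments for redshift_connector.connect() with SSL spelled out."""
    return {
        "host": host,
        "database": database,
        "user": user,
        "password": password,
        "ssl": True,         # already the default; stated explicitly for clarity
        "sslmode": sslmode,  # assumption: 'verify-ca' or 'verify-full' as appropriate
    }


def query_to_dataframe(sql, **connect_kwargs):
    """Run a query over SSL and return the result as a pandas DataFrame."""
    # Imported lazily so the kwargs helper works without the driver installed.
    import redshift_connector

    conn = redshift_connector.connect(**connect_kwargs)
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        return cursor.fetch_dataframe()
    finally:
        conn.close()


# Placeholder connection details, not a real cluster.
kwargs = ssl_connect_kwargs(
    host="examplecluster.abc123xyz789.us-west-1.redshift.amazonaws.com",
    database="dev",
    user="awsuser",
    password="my_password",
)
print({k: v for k, v in kwargs.items() if k != "password"})
```

Against a reachable cluster, df = query_to_dataframe("select * from some_table", **kwargs) would then return the rows as a DataFrame, where some_table stands in for an existing table.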

Gonçalo Peres answered Mar 20 '26 12:03


