In Amazon Redshift's Getting Started Guide, data is pulled from Amazon S3 and loaded into an Amazon Redshift Cluster utilizing SQLWorkbench/J. I'd like to mimic the same process of connecting to the cluster and loading sample data into the cluster utilizing Boto3.
However, in Boto3's documentation for Redshift, I'm unable to find a method that would allow me to upload data into an Amazon Redshift cluster.
I've been able to connect to Redshift utilizing Boto3 with the following code:
import boto3

client = boto3.client('redshift')
But I'm not sure what method would allow me to either create tables or upload data to Amazon Redshift the way it's done in the tutorial with SQLWorkbench/J.
Right, you need the psycopg2 Python module to execute the COPY command.
My code looks like this:
import psycopg2

# Amazon Redshift connect string
conn_string = "dbname='***' port='5439' user='***' password='***' host='mycluster.***.redshift.amazonaws.com'"

# Connect to Redshift (the cluster must be reachable from this machine,
# e.g. publicly accessible or opened up in its security group)
con = psycopg2.connect(conn_string)

# Build the COPY statement; to_table, fn, delim, quote and gzip are set beforehand
sql = """COPY %s FROM '%s' credentials
    'aws_access_key_id=%s;aws_secret_access_key=%s'
    delimiter '%s' FORMAT CSV %s %s; commit;""" % \
    (to_table, fn, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, delim, quote, gzip)

# Here
#   fn   = s3://path_to__input_file.gz
#   gzip = 'gzip'

cur = con.cursor()
cur.execute(sql)
con.close()
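For context, here is one way the remaining placeholders might be filled in. The table name, bucket path and option strings below are made-up examples for illustration, not values from the original script:

# Hypothetical example values for the placeholders used above
to_table = 'public.sample_table'                  # target Redshift table (made-up name)
fn = 's3://my-bucket/path_to__input_file.gz'      # S3 source file (made-up bucket)
AWS_ACCESS_KEY_ID = '***'
AWS_SECRET_ACCESS_KEY = '***'
delim = ','                                       # field delimiter
quote = "QUOTE '\""                               # CSV quote-character option
gzip = 'GZIP'                                     # source file is gzip-compressed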
I used boto3/psycopg2 to write CSV_Loader_For_Redshift.
Go back to step 4 in that tutorial you linked. See where it shows you how to get the URL of the cluster? You have to connect to that URL with a PostgreSQL driver. The AWS SDKs such as Boto3 only give you access to the AWS management API (creating, describing and resizing clusters); they don't run SQL for you. To create tables or load data, you need to connect to Redshift over its PostgreSQL-compatible interface, just like you would connect to a PostgreSQL database on RDS.
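To tie the two together: Boto3 can tell you the cluster's endpoint, but the SQL itself (CREATE TABLE, COPY) has to go through a PostgreSQL driver such as psycopg2. A minimal sketch, assuming a cluster identifier of examplecluster and using placeholder database, table, bucket and credential values:

import boto3
import psycopg2

# Look up the cluster endpoint through the management API (this is what Boto3 is for)
redshift = boto3.client('redshift')
cluster = redshift.describe_clusters(ClusterIdentifier='examplecluster')['Clusters'][0]
endpoint = cluster['Endpoint']['Address']
port = cluster['Endpoint']['Port']

# Open a SQL connection to that endpoint with a PostgreSQL driver
con = psycopg2.connect(
    host=endpoint,
    port=port,
    dbname='dev',
    user='masteruser',
    password='***',
)
cur = con.cursor()

# Create a table and load it from S3, as the tutorial does with SQLWorkbench/J
cur.execute("CREATE TABLE IF NOT EXISTS users_staging (userid INTEGER, username VARCHAR(8))")
cur.execute("""
    COPY users_staging
    FROM 's3://my-bucket/users.txt'
    CREDENTIALS 'aws_access_key_id=***;aws_secret_access_key=***'
    DELIMITER '|'
""")
con.commit()
con.close()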