
Copying data from S3 to AWS redshift using python and psycopg2

I'm having issues executing the copy command to load data from S3 to Amazon's Redshift from python.
I have the following copy command:

copy moves from 's3://<my_bucket_name>/moves_data/2013-03-24/18/moves'
credentials 'aws_access_key_id=<key_id>;aws_secret_access_key=<key_secret>'
removequotes
delimiter ',';
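
In Python the statement is just held as a plain string, roughly like this (the bucket, prefix, and credential values are placeholders), which is what copy_command refers to below:

copy_command = """
    copy moves from 's3://<my_bucket_name>/moves_data/2013-03-24/18/moves'
    credentials 'aws_access_key_id=<key_id>;aws_secret_access_key=<key_secret>'
    removequotes
    delimiter ',';
"""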

When I execute this command using SQL Workbench/J everything works as expected. However, when I try to execute it with Python and psycopg2, the command passes OK but no data is loaded and no error is thrown.
I tried the following two options (assume the psycopg2 connection is OK, because it is):

cursor.execute(copy_command)  
cursor.copy_expert(copy_command, sys.stdout)

Both pass without warnings, yet the data isn't loaded.

Ideas?

Thanks

asked Mar 24 '13 by Yaniv Golan

1 Answer

I have used this exact setup (psycopg2 + redshift + COPY) successfully. Did you commit afterwards? SQL Workbench defaults to auto-commit while psycopg2 defaults to opening a transaction, so the data won't be visible until you call commit() on your connection.

The full workflow is:

conn = psycopg2.connect(...)
cur = conn.cursor()
cur.execute("COPY...")
conn.commit()
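
For completeness, a minimal runnable sketch of that workflow; the connection parameters, bucket, and credentials are placeholders:

import psycopg2

# Placeholder connection details -- substitute your cluster's endpoint and credentials
conn = psycopg2.connect(
    host="<cluster-endpoint>",
    port=5439,
    dbname="<db_name>",
    user="<user>",
    password="<password>",
)

copy_command = """
    copy moves from 's3://<my_bucket_name>/moves_data/2013-03-24/18/moves'
    credentials 'aws_access_key_id=<key_id>;aws_secret_access_key=<key_secret>'
    removequotes
    delimiter ',';
"""

cur = conn.cursor()
cur.execute(copy_command)
conn.commit()   # psycopg2 opens a transaction implicitly; without this the load is discarded
cur.close()
conn.close()

If you'd rather not manage the transaction yourself, setting conn.autocommit = True before executing, or using the connection as a context manager (with conn: ...), which commits on success, has the same effect.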

I don't believe that copy_expert() or any of the cursor.copy_* commands work with Redshift; they rely on PostgreSQL's COPY FROM STDIN / COPY TO STDOUT protocol, while Redshift's COPY only loads from sources such as S3.

answered Oct 10 '22 by Voket