
How to write data to Redshift that is a result of a dataframe created in Python?

I have a DataFrame in Python. Can I write this data to Redshift as a new table? I have successfully created a database connection to Redshift and can execute simple SQL queries. Now I need to write a DataFrame to it.

Sahil asked Jul 15 '16 18:07




1 Answer

You can use the pandas DataFrame.to_sql method to push data to a Redshift database. I've been able to do this using a connection to my database through a SQLAlchemy engine. Just be sure to set index=False in your to_sql call, so the DataFrame index is not written as an extra column. The table will be created if it doesn't exist, and the if_exists argument controls whether the call replaces the table, appends to it, or fails if the table already exists.

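from sqlalchemy import create_engine
import pandas as pd

# username, password, your-cluster-host and yourdatabase are placeholders
conn = create_engine('postgresql://username:password@your-cluster-host:5439/yourdatabase')

df = pd.DataFrame([{'A': 'foo', 'B': 'green', 'C': 11},
                   {'A': 'bar', 'B': 'blue', 'C': 20}])

# index=False skips the DataFrame index; if_exists='replace' recreates the table
df.to_sql('your_table', conn, index=False, if_exists='replace')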
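One thing that can trip up the connection URL in the example: if the username or password contains reserved characters such as @ or /, they have to be percent-encoded before they go into the URL. Here is a minimal sketch using only the standard library. The helper name build_redshift_url and all the values passed to it are hypothetical, made up for illustration.

```python
from urllib.parse import quote


def build_redshift_url(user, password, host, database, port=5439):
    """Return a postgresql:// URL with user and password percent-encoded."""
    # safe="" makes quote() encode every reserved character, including "/".
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


# Placeholder credentials and host, for illustration only.
url = build_redshift_url("username", "p@ss/word", "example-host", "yourdatabase")
print(url)
```

The resulting string can be passed to create_engine in place of the hand-written URL.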

Note that you may need to pip install psycopg2 in order to connect to Redshift through SQLAlchemy, since a postgresql:// URL uses the psycopg2 driver by default.
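If you want to see how the three if_exists modes behave before pointing the code at a real cluster, you can try them locally. The sketch below assumes an in-memory SQLite database as a stand-in for Redshift (pandas accepts a sqlite3 connection directly), so it only illustrates the to_sql semantics, not Redshift-specific behavior or performance.

```python
import sqlite3

import pandas as pd

# In-memory SQLite stands in for Redshift so the snippet runs without a cluster.
conn = sqlite3.connect(":memory:")

df = pd.DataFrame([{"A": "foo", "B": "green", "C": 11},
                   {"A": "bar", "B": "blue", "C": 20}])

# 'replace' drops any existing table and recreates it from the DataFrame.
df.to_sql("your_table", conn, index=False, if_exists="replace")

# 'append' keeps the existing table and adds the same rows again.
df.to_sql("your_table", conn, index=False, if_exists="append")
row_count = conn.execute("SELECT COUNT(*) FROM your_table").fetchone()[0]

# 'fail' (the default) raises ValueError because the table already exists.
try:
    df.to_sql("your_table", conn, index=False, if_exists="fail")
    failed = False
except ValueError:
    failed = True

# With index=False, only the DataFrame columns end up in the table.
columns = [row[1] for row in conn.execute("PRAGMA table_info(your_table)")]
print(row_count, failed, columns)
```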

to_sql Documentation

Andrew answered Sep 20 '22 23:09