I am trying to write a pandas DataFrame to Redshift.
Here is the code:
import pandas as pd
from sqlalchemy import create_engine

df = pd.DataFrame({'num_legs': [2, 4, 8, 0],
                   'num_wings': [2, 0, 0, 0],
                   'num_specimen_seen': [10, 2, 1, 8]},
                  index=['falcon', 'dog', 'spider', 'fish'])

sql_engine = create_engine('postgresql://username:password@host:port/dbname')
# A raw DBAPI connection (not the SQLAlchemy engine) is passed to to_sql here
conn = sql_engine.raw_connection()
df.to_sql('tmp_table', conn, index=False, if_exists='replace')
However, I get the following error:
---------------------------------------------------------------------------
UndefinedTable Traceback (most recent call last)
~/opt/anaconda3/envs/UserExperience/lib/python3.7/site-packages/pandas/io/sql.py in execute(self, *args, **kwargs)
1594 else:
-> 1595 cur.execute(*args)
1596 return cur
UndefinedTable: relation "sqlite_master" does not exist
...
...
...
1593 cur.execute(*args, **kwargs)
1594 else:
-> 1595 cur.execute(*args)
1596 return cur
1597 except Exception as exc:
DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': relation "sqlite_master" does not exist
I tried to use pandas_redshift.
However, it seems that one first has to upload the data to an S3 bucket and then load it into Redshift. I would like to upload directly. Similarly, here I see an answer that suggests uploading to S3 first and then to Redshift.
I can read from and query the database using the same connection.
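I just had the same issue, and passing the SQLAlchemy engine itself to to_sql (instead of a raw connection) did the trick. pandas treats any connection that is not a SQLAlchemy engine or connection as a sqlite3 connection, which is why it ends up querying sqlite_master. Try the following: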
import sqlalchemy

# Use the 'postgresql://' dialect name; SQLAlchemy 1.4+ no longer accepts 'postgres://'
engine = sqlalchemy.create_engine('postgresql://username:password@url:5439/db_name')
print(bool(engine))  # <- just to keep track of the process
with engine.connect() as conn:
    print(bool(conn))  # <- just to keep track of the process
    # df and table_name are assumed to be defined already
    df.to_sql(name=table_name, con=engine)
    print("end")  # <- just to keep track of the process
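For anyone who wants to check the behaviour without a Redshift cluster at hand, here is a minimal sketch of the same idea. It assumes an in-memory SQLite engine as a stand-in for Redshift (that substitution is mine, not part of the original setup), and reuses the DataFrame from the question. The point it illustrates is only that to_sql receives the SQLAlchemy engine rather than engine.raw_connection().

```python
import pandas as pd
from sqlalchemy import create_engine, inspect

# Same sample data as in the question
df = pd.DataFrame(
    {
        "num_legs": [2, 4, 8, 0],
        "num_wings": [2, 0, 0, 0],
        "num_specimen_seen": [10, 2, 1, 8],
    },
    index=["falcon", "dog", "spider", "fish"],
)

# Assumption: an in-memory SQLite engine stands in for Redshift so the
# sketch runs anywhere. For Redshift you would build the engine from a
# 'postgresql://username:password@host:5439/dbname' URL instead.
engine = create_engine("sqlite://")

# Pass the Engine itself, not engine.raw_connection()
df.to_sql("tmp_table", engine, index=False, if_exists="replace")

# Confirm the table exists and read the rows back through the same engine
table_names = inspect(engine).get_table_names()
round_trip = pd.read_sql_query("SELECT * FROM tmp_table", engine)

print(table_names)
print(round_trip)
```

Because index=False is kept from the question, the animal names in the index are not written to the table; drop that argument if you want them stored as a column.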