Is there any way to do an SQL update-where from a dataframe without iterating through each line? I have a postgresql database and to update a table in the db from a dataframe I would use psycopg2 and do something like:
import psycopg2

con = psycopg2.connect(database='mydb', user='abc', password='xyz')
cur = con.cursor()
sql = 'update table set column = %s where column = %s'
for index, row in df.iterrows():
    cur.execute(sql, (row['whatver'], row['something']))
con.commit()
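Even before moving to a set-based UPDATE, the per-row loop above can be collapsed into a single executemany call, which sends the whole batch of parameter tuples at once instead of one execute per DataFrame row. A minimal runnable sketch, using the stdlib sqlite3 module in place of psycopg2 (so the table name, columns, and ? placeholders here are illustrative; psycopg2 uses %s):

```python
import sqlite3

# In-memory database standing in for the PostgreSQL connection.
con = sqlite3.connect(':memory:')
cur = con.cursor()
cur.execute('CREATE TABLE t (id INTEGER PRIMARY KEY, col TEXT)')
cur.executemany('INSERT INTO t VALUES (?, ?)', [(1, 'a'), (2, 'b'), (3, 'c')])

# One executemany call replaces the Python-level loop of execute() calls.
# The parameter list could come straight from the DataFrame,
# e.g. list(df[['new_value', 'id']].itertuples(index=False)).
updates = [('x', 1), ('y', 3)]  # (new value, id) pairs
cur.executemany('UPDATE t SET col = ? WHERE id = ?', updates)
con.commit()

print(cur.execute('SELECT col FROM t ORDER BY id').fetchall())
# [('x',), ('b',), ('y',)]
```

This still issues one UPDATE statement per pair on the server side, but it removes the Python loop overhead and keeps everything in one commit.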
But on the other hand, if I'm either reading a table from SQL or writing an entire dataframe to SQL (with no update-where), then I would just use pandas and SQLAlchemy. Something like:
from sqlalchemy import create_engine

engine = create_engine('postgresql+psycopg2://user:pswd@mydb')
df.to_sql('table', engine, if_exists='append')
It's great just having a one-liner using to_sql. Isn't there something similar to do an update-where from pandas to PostgreSQL? Or is the only way to do it by iterating through each row as I've done above? Isn't iterating through each row an inefficient way to do it?
This main difference can make the two tools feel separate; however, many of the same operations can be performed in either one. For example, you can create new features from existing columns in pandas, often more easily and quickly than in SQL.
pandasql is a Python library that allows manipulation of a pandas DataFrame using SQL. Under the hood, pandasql creates an SQLite table from the DataFrame of interest and allows users to query that SQLite table using SQL.
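The same copy-into-SQLite-then-query mechanism can be sketched directly with pandas and the stdlib sqlite3 module, without pandasql itself (the DataFrame contents and query here are made up for illustration):

```python
import sqlite3
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3], 'val': [10, 20, 30]})

# Same idea as pandasql: copy the DataFrame into an SQLite table,
# then run ordinary SQL against that table.
con = sqlite3.connect(':memory:')
df.to_sql('df', con, index=False)
result = pd.read_sql('SELECT id, val FROM df WHERE val > 15', con)
print(result['id'].tolist())
# [2, 3]
```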
Consider a temp table that is an exact replica of your final table, cleaned out with each run:
from sqlalchemy import create_engine, text

engine = create_engine('postgresql+psycopg2://user:pswd@mydb')
df.to_sql('temp_table', engine, if_exists='replace')

# text() is required for raw SQL strings in SQLAlchemy 2.x
sql = text("""
    UPDATE final_table AS f
    SET col1 = t.col1
    FROM temp_table AS t
    WHERE f.id = t.id
""")

with engine.begin() as conn:  # runs in a transaction
    conn.execute(sql)
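The same temp-table pattern can be exercised end to end against stdlib SQLite, which needs no running database. Since PostgreSQL's UPDATE ... FROM join syntax is only available in newer SQLite versions, this sketch uses the equivalent correlated-subquery form; the table and column names are illustrative:

```python
import sqlite3
import pandas as pd

con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE final_table (id INTEGER PRIMARY KEY, col1 TEXT)')
con.executemany('INSERT INTO final_table VALUES (?, ?)',
                [(1, 'old'), (2, 'old'), (3, 'old')])

# Push the DataFrame into the temp table, replacing it on each run.
df = pd.DataFrame({'id': [1, 3], 'col1': ['new1', 'new3']})
df.to_sql('temp_table', con, if_exists='replace', index=False)

# Correlated-subquery equivalent of PostgreSQL's
# UPDATE final_table AS f SET col1 = t.col1 FROM temp_table AS t WHERE f.id = t.id
with con:  # transaction
    con.execute("""
        UPDATE final_table
        SET col1 = (SELECT t.col1 FROM temp_table t
                    WHERE t.id = final_table.id)
        WHERE id IN (SELECT id FROM temp_table)
    """)

print(con.execute('SELECT col1 FROM final_table ORDER BY id').fetchall())
# [('new1',), ('old',), ('new3',)]
```

The whole update is one SQL statement, so rows that appear in the temp table are updated in a single set-based operation instead of a Python loop.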