Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Bulk Upsert with SQLAlchemy Postgres

I'm following the SQLAlchemy documentation here to write a bulk upsert statement with Postgres. For demonstration purposes, I have a simple table MyTable:

class MyTable(base):
    __tablename__ = 'mytable'
    id = Column(types.Integer, primary_key=True)
    test_value = Column(types.Text)

Creating a generic insert statement is simple enough:

from sqlalchemy.dialects import postgresql

values = [{'id': 0, 'test_value': 'a'}, {'id': 1, 'test_value': 'b'}]
insert_stmt = postgresql.insert(MyTable.__table__).values(values)

The problem I run into is when I try to add the "on conflict" part of the upsert.

update_stmt = insert_stmt.on_conflict_do_update(
    index_elements=[MyTable.id],
    set_=dict(data=values)
)

Trying to execute this statement yields a ProgrammingError:

from sqlalchemy import create_engine
engine = create_engine('postgres://localhost/db_name')

engine.execute(update_stmt)

>>> ProgrammingError: (psycopg2.ProgrammingError) can't adapt type 'dict'

I think my misunderstanding is in constructing the statement with the on_conflict_do_update method. Does anyone know how to construct this statement? I have looked at other questions on StackOverflow (eg. here) but I can't seem to a way to address the above error.

like image 923
user144153 Avatar asked Mar 27 '19 00:03

user144153


1 Answers

update_stmt = insert_stmt.on_conflict_do_update(
    index_elements=[MyTable.id],
    set_=dict(data=values)
)

index_elements should either be a list of strings or a list of column objects. So either [MyTable.id] or ['id'] (This is correct)

set_ should be a dictionary with column names as keys and valid sql update objects as values. You can reference values from the insert block using the excluded attribute. So to get the result you are hoping for here you would want set_={'test_value': insert_stmt.excluded.test_value} (The error you made is that data= in the example isn't a magic argument... it was the name of the column on their example table)

So, the whole thing would be

update_stmt = insert_stmt.on_conflict_do_update(
    index_elements=[MyTable.id],
    set_={'test_value': insert_stmt.excluded.test_value}
)

Of course, in a real world example I usually want to change more then one column. In that case I would do something like...

update_columns = {col.name: col for col in insert_stmt.excluded if col.name not in ('id', 'datetime_created')}
update_statement = insert_stmt.on_conflict_do_update(index_elements=['id'], set_=update_columns)

(This example would overwrite every column except for the id and datetime_created columns)

like image 127
Paul Becotte Avatar answered Oct 19 '22 21:10

Paul Becotte