
SQLAlchemy update multiple rows in one transaction

How can I update multiple existing rows in a database using a dictionary that maps existing values of one column to the required new values of another column?

I have a table:

class MyTable(BaseModel):
    col1 = sa.Column(sa.String(256))
    col2 = sa.Column(sa.String(256))

Given that col1 already has values and col2 is empty, how can I update col2 if I have the data as a dictionary:

payload = {'x': 'y', 'a': 'b', 'c': 'd'}

So this payload maps values for col1, to a new value for col2; after the update you'd get [{'col1': 'x', 'col2': 'y'}, ...] from the database.

I tried a couple of ways that work, but I don't think they are as efficient as they could be. For example:

my_table = MyTable.__table__
for key, value in payload.items():
    # one UPDATE statement per dictionary entry
    stm = my_table.update()
    stm = stm.where(my_table.c.col1 == key)
    stm = stm.values({'col2': value})
    session.execute(stm)

Or like this:

for key, value in payload.items():
    query = session.query(MyTable).filter(MyTable.col1==key)
    query.update({MyTable.col2: value})

Both of these solutions work as expected. The only thing bothering me is the time it takes: for a payload of 100 elements it takes up to 6 seconds, and I'm almost sure there should be a better way to do this. Is there?

I was wondering whether there is a way of making it work with the in_ function:

session.query(MyTable).filter(
        MyTable.col1.in_(payload.keys())
    )

but I don't know how to structure the update query.

asked Jan 25 '19 13:01 by sken3r



1 Answer

Yes, updating a larger number of rows with a single bulk UPDATE statement will be a lot faster than using individual UPDATEs on each and every object. An IN filter would only help you limit what rows are updated, but you still need to tell the database what value to use for the col2 updates.

You can use a CASE ... WHEN ... THEN construct for that, with the case() function:

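from sqlalchemy.sql import case

# `session` is the SQLAlchemy session used in the question
session.query(MyTable).filter(
    # iterating a dict yields its keys, so this lists the col1 values to match
    MyTable.col1.in_(list(payload))
).update({
    # dict form of case(): compare col1 against each key, use the mapped value
    MyTable.col2: case(
        payload,
        value=MyTable.col1,
    )
}, synchronize_session=False)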

The above a) selects rows where the col1 value is a key in the payload dictionary, then b) updates the col2 column value using a CASE statement that picks values from that same dictionary to update that column based on matching col1 against the keys.
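To try this end to end, here is a minimal sketch against an in-memory SQLite database. It assumes SQLAlchemy 1.4 or newer (for `declarative_base` in `sqlalchemy.orm` and `Session` as a context manager). The `id` primary key, the `mytable` table name and the extra `untouched` row are assumptions added for illustration; the question's model does not show them.

```python
import sqlalchemy as sa
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class MyTable(Base):
    __tablename__ = "mytable"  # assumed table name
    id = sa.Column(sa.Integer, primary_key=True)  # assumed primary key
    col1 = sa.Column(sa.String(256))
    col2 = sa.Column(sa.String(256))


engine = sa.create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

payload = {"x": "y", "a": "b", "c": "d"}

with Session(engine) as session:
    # Seed rows with col1 filled and col2 empty, plus one row that must not change.
    session.add_all([MyTable(col1=key) for key in payload])
    session.add(MyTable(col1="untouched"))
    session.commit()

    # A single UPDATE ... SET col2 = CASE col1 WHEN ... END WHERE col1 IN (...)
    matched = (
        session.query(MyTable)
        .filter(MyTable.col1.in_(list(payload)))
        .update(
            {MyTable.col2: sa.case(payload, value=MyTable.col1)},
            synchronize_session=False,
        )
    )
    session.commit()

    # Read everything back as {col1: col2}.
    result = {row.col1: row.col2 for row in session.query(MyTable)}
```

The `update()` call is the same statement as in the snippet above; `matched` holds the row count reported by the database, and the `untouched` row keeps an empty col2 because the IN filter excludes it.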

With payload set to {'x': 'y', 'a': 'b', 'c': 'd'}, the above executes the following query (give or take the exact order of WHEN clauses and values in the IN test):

UPDATE mytable
SET
    col2=CASE mytable.col1
        WHEN 'x' THEN 'y'
        WHEN 'a' THEN 'b'
        WHEN 'c' THEN 'd'
    END
WHERE
    mytable.col1 IN ('x', 'a', 'c')
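If you want to see that statement work without SQLAlchemy in the picture, the sketch below builds the same CASE ... WHEN ... THEN update with the standard library's sqlite3 module, using `?` placeholders instead of inlined literals. The table layout and the `untouched` row are assumptions for the demo, and this is not the exact SQL string SQLAlchemy emits.

```python
import sqlite3

payload = {"x": "y", "a": "b", "c": "d"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO mytable (col1) VALUES (?)",
    [(key,) for key in payload] + [("untouched",)],
)

# One "WHEN ? THEN ?" pair per dictionary entry, one "?" per key in the IN list.
when_clauses = " ".join("WHEN ? THEN ?" for _ in payload)
in_list = ", ".join("?" for _ in payload)
sql = (
    f"UPDATE mytable SET col2 = CASE col1 {when_clauses} END "
    f"WHERE col1 IN ({in_list})"
)

# Parameters follow placeholder order: key, value, key, value, ..., then the keys.
params = [item for pair in payload.items() for item in pair] + list(payload)

cursor = conn.execute(sql, params)
updated_count = cursor.rowcount

result = dict(conn.execute("SELECT col1, col2 FROM mytable").fetchall())
```

Without the WHERE clause the CASE expression would have no matching branch for the `untouched` row and would set its col2 to NULL, which is why the IN test is still needed alongside the CASE.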

I set synchronize_session to False there, as updating all possible cached MyTable instances at once is perhaps not the best idea when updating a large number of rows. Your other options are 'evaluate' and 'fetch'.

  • We can't use the default 'evaluate' (which would find existing objects in the session that match the where clause, to update in-place), because SQLAlchemy currently doesn't know how to process an IN filter (you get an UnevaluatableError exception).

  • If you do use 'fetch' then all instances of MyTable cached in the session that were affected are updated with new values for col2 (as mapped by their primary key).

Note that a commit would expire the session anyway, so you'd only want to use 'fetch' if you need to do some more work with the updated rows before you can commit the current transaction.
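The difference between False and 'fetch' shows up on objects that were loaded before the bulk update. The following is a small sketch of that, assuming SQLAlchemy 1.4 or newer and an in-memory SQLite database; the `id` column, the table name and the helper function `col2_seen_by_loaded_object` are hypothetical names made up for this demo.

```python
import sqlalchemy as sa
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class MyTable(Base):
    __tablename__ = "mytable"  # assumed table name
    id = sa.Column(sa.Integer, primary_key=True)  # assumed primary key
    col1 = sa.Column(sa.String(256))
    col2 = sa.Column(sa.String(256))


payload = {"x": "y"}


def col2_seen_by_loaded_object(synchronize_session):
    """Return col2 as seen on an object that was loaded before the bulk update."""
    engine = sa.create_engine("sqlite://")  # fresh in-memory database per call
    Base.metadata.create_all(engine)
    with Session(engine) as session:
        session.add(MyTable(col1="x"))
        session.commit()

        # Load the row into the session; col2 is still empty at this point.
        obj = session.query(MyTable).filter_by(col1="x").one()

        session.query(MyTable).filter(MyTable.col1.in_(list(payload))).update(
            {MyTable.col2: sa.case(payload, value=MyTable.col1)},
            synchronize_session=synchronize_session,
        )
        # Read the attribute before any commit or explicit refresh.
        return obj.col2


stale = col2_seen_by_loaded_object(False)
synced = col2_seen_by_loaded_object("fetch")
```

With False the already-loaded object keeps its old, empty col2 until it is expired or refreshed; with 'fetch' the session brings the object up to date, so the new value is visible before the commit.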

See the Query.update() documentation for more information on what synchronize_session options you have.

answered Sep 20 '22 14:09 by Martijn Pieters