What is the correct way to update an SQLAlchemy ORM column from a pandas DataFrame column?

Tags: python, pandas, sqlalchemy

I've loaded some data and modified one column in the dataframe and would like to update the DB to reflect the changes.

I tried:

db.session.query(sqlTableName).update({sqlTableName.sql_col_name: pdDataframe.pd_col_name})

But that just wiped out the column in the database (set every value to '0', the default). I tried a few other data formats with no luck. I'm guessing that either I've mixed up the datatypes somewhere, or you just aren't allowed to update a column directly from a variable like this.

I could do this with a loop, but that would be genuinely awful. Sorry for the basic question; after a long break from a project, my grasp of SQLAlchemy has certainly waned.


asked Aug 11 '21 21:08

Ambiwlans

1 Answer

To upload the DataFrame to a temporary table and then perform an UPDATE, you don't need to write the SQL yourself; you can have SQLAlchemy Core do it for you:

import pandas as pd
import sqlalchemy as sa


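def update_table_columns_from_df(engine, df, table_name, cols_to_update):
    """Update cols_to_update in table_name from df, matched on primary key."""
    metadata = sa.MetaData()
    # Reflect the target table so its primary key columns can be discovered.
    main_table = sa.Table(table_name, metadata, autoload_with=engine)
    pk_columns = [x.name for x in main_table.primary_key.columns]

    # Stage the DataFrame in a regular table named "temp_table"
    # (replaced if it already exists, dropped again at the end).
    df.to_sql("temp_table", engine, index=False, if_exists="replace")

    temp_table = sa.Table("temp_table", metadata, autoload_with=engine)
    with engine.begin() as conn:
        # SET <col> = temp_table.<col> for every column to update.
        values_clause = {x: temp_table.columns[x] for x in cols_to_update}
        # Match rows of the two tables on every primary key column.
        where_clause = sa.and_(
            main_table.columns[x] == temp_table.columns[x] for x in pk_columns
        )
        conn.execute(
            main_table.update().values(values_clause).where(where_clause)
        )
    temp_table.drop(engine)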


if __name__ == "__main__":
    test_engine = sa.create_engine(
        "postgresql+psycopg2://scott:[email protected]/test",
        echo=True,  # (for demonstration purposes)
    )
    with test_engine.begin() as test_conn:
        test_conn.exec_driver_sql("DROP TABLE IF EXISTS main_table")
        test_conn.exec_driver_sql(
            """\
            CREATE TABLE main_table ( 
            id1 integer NOT NULL,
            id2 integer NOT NULL,
            txt1 varchar(50),
            txt2 varchar(50),
            CONSTRAINT main_table_pkey PRIMARY KEY (id1, id2)
            )
            """
        )
        test_conn.exec_driver_sql(
            """\
            INSERT INTO main_table (id1, id2, txt1, txt2)
            VALUES (1, 1, 'foo', 'x'), (1, 2, 'bar', 'y'), (1, 3, 'baz', 'z')
            """
        )

    df_updates = pd.DataFrame(
        [
            (1, 1, "new_foo", "new_x"),
            (1, 3, "new_baz", "new_z"),
        ],
        columns=["id1", "id2", "txt1", "txt2"],
    )
    update_table_columns_from_df(
        test_engine, df_updates, "main_table", ["txt1", "txt2"]
    )
    """SQL emitted:
    UPDATE main_table 
    SET txt1=temp_table.txt1, txt2=temp_table.txt2 
    FROM temp_table 
    WHERE main_table.id1 = temp_table.id1 AND main_table.id2 = temp_table.id2
    """

    df_result = pd.read_sql_query(
        "SELECT * FROM main_table ORDER BY id1, id2", test_engine
    )
    print(df_result)
    """
       id1  id2     txt1   txt2
    0    1    1  new_foo  new_x
    1    1    2      bar      y
    2    1    3  new_baz  new_z
    """

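For a small DataFrame, a possible alternative to the staging table is to let the database driver run one parameterized UPDATE per row ("executemany"), which avoids an explicit Python loop around the statement. The following is a minimal sketch and not part of the answer above: the function name `update_column_with_executemany`, the `demo` table, and the in-memory SQLite engine are assumptions chosen so the example runs without a PostgreSQL server.

```python
import pandas as pd
import sqlalchemy as sa


def update_column_with_executemany(engine, df, table_name, key_col, value_col):
    """Update value_col in table_name from df, matching rows on key_col."""
    metadata = sa.MetaData()
    table = sa.Table(table_name, metadata, autoload_with=engine)

    # The bind parameter names are deliberately different from the column
    # names so they cannot clash with the parameters SQLAlchemy generates
    # for the SET clause.
    stmt = (
        table.update()
        .where(table.c[key_col] == sa.bindparam("b_key"))
        .values({value_col: sa.bindparam("b_value")})
    )

    # .tolist() converts NumPy scalars to plain Python values, which
    # database drivers generally expect.
    params = [
        {"b_key": key, "b_value": value}
        for key, value in zip(df[key_col].tolist(), df[value_col].tolist())
    ]
    if not params:
        return  # nothing to update

    with engine.begin() as conn:
        # Passing a list of dicts makes SQLAlchemy use executemany.
        conn.execute(stmt, params)


if __name__ == "__main__":
    # Hypothetical demo data in an in-memory SQLite database.
    demo_engine = sa.create_engine("sqlite://")
    with demo_engine.begin() as demo_conn:
        demo_conn.exec_driver_sql(
            "CREATE TABLE demo (id INTEGER PRIMARY KEY, txt VARCHAR(50))"
        )
        demo_conn.exec_driver_sql(
            "INSERT INTO demo (id, txt) VALUES (1, 'foo'), (2, 'bar'), (3, 'baz')"
        )

    df_demo = pd.DataFrame({"id": [1, 3], "txt": ["new_foo", "new_baz"]})
    update_column_with_executemany(demo_engine, df_demo, "demo", "id", "txt")

    with demo_engine.connect() as demo_conn:
        result = demo_conn.execute(sa.text("SELECT id, txt FROM demo ORDER BY id"))
        for row in result:
            print(row)
```

This trades the single set-based UPDATE of the staging-table approach for one statement execution per row, so it is likely to be slower on large DataFrames; the staging table is probably the better fit there.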

answered Nov 15 '22 06:11

Gord Thompson
