I have a table with a single primary key. When I attempt an insert, there may be a conflict caused by trying to insert a row with an existing key. In that case I want the insert to update all columns instead. Is there any easy syntax for this? I am trying to let it "upsert" all columns.
I am using PostgreSQL 9.5.5.
You must have INSERT privilege on a table in order to insert into it. If ON CONFLICT DO UPDATE is present, UPDATE privilege on the table is also required. If a column list is specified, you only need INSERT privilege on the listed columns.
First, specify the name of the table whose data you want to change in the UPDATE clause. Second, assign a new value to the column that you want to update. To update data in multiple columns, separate the column = value pairs with commas (,). Third, specify which rows you want to update in the WHERE clause.
Introduction to the PostgreSQL upsert
The idea is that when you insert a new row into the table, PostgreSQL will update the row if it already exists; otherwise, it will insert the new row. That is why the action is called upsert (a combination of update and insert).
The UPDATE syntax requires explicitly naming target columns. There can be reasons to avoid that.
"All columns" has to mean "all columns of the target table" (or at least "leading columns of the table") in matching order and matching data type. Else you'd have to provide a list of target column names anyway.
Test table:
CREATE TABLE tbl (
   id    int PRIMARY KEY
 , text  text
 , extra text
);

INSERT INTO tbl AS t
VALUES (1, 'foo')
     , (2, 'bar');
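For comparison, here is a minimal sketch of the regular UPSERT with explicitly named columns, run against the test table above. This is the form the rest of the answer tries to avoid spelling out; the inserted values are only illustrative.

```sql
-- Regular UPSERT: every column to overwrite must be named in SET.
-- EXCLUDED refers to the row that was proposed for insertion.
INSERT INTO tbl AS t (id, text, extra)
VALUES (1, 'foo_upd', 'a')      -- id 1 exists: row gets updated
     , (3, 'baz', 'c')          -- id 3 is new: row gets inserted
ON     CONFLICT (id) DO UPDATE
SET    text  = EXCLUDED.text
     , extra = EXCLUDED.extra;
```

With many columns, the SET list grows accordingly, which is what motivates the alternatives below.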
DELETE & INSERT in single query instead
Without knowing any column names except id.
Only works for "all columns of the target table". While the syntax even works for a leading subset, excess columns in the target table would be reset to NULL with DELETE and INSERT.
UPSERT (INSERT ... ON CONFLICT ...) is needed to avoid concurrency / locking issues under concurrent write load, and only because there is no general way to lock not-yet-existing rows in Postgres (value locking).
Your special requirement only affects the UPDATE part. Possible complications do not apply where existing rows are affected. Those are locked properly. Simplifying some more, you can reduce your case to DELETE and INSERT:
WITH data(id) AS (              -- Only 1st column gets explicit name!
   VALUES
      (1, 'foo_upd', 'a')       -- changed
    , (2, 'bar', 'b')           -- unchanged
    , (3, 'baz', 'c')           -- new
   )
, del AS (
   DELETE FROM tbl AS t
   USING  data d
   WHERE  t.id = d.id
   -- AND    t <> d             -- optional, to avoid empty updates
   )                            -- only works for complete rows
INSERT INTO tbl AS t
TABLE  data                     -- short for: SELECT * FROM data
ON     CONFLICT (id) DO NOTHING
RETURNING t.id;
In the Postgres MVCC model, an UPDATE is largely the same as DELETE and INSERT anyway (except for some corner cases with concurrency, HOT updates, and big column values stored out of line). Since you want to replace all rows anyway, just remove conflicting rows before the INSERT. Deleted rows remain locked until the transaction is committed. The INSERT might only find conflicting rows for previously non-existing key values if a concurrent transaction happens to insert them concurrently (after the DELETE, but before the INSERT).
You would lose additional column values for affected rows in this special case. No exception is raised. But if competing queries have equal priority, that's hardly a problem: the other query won for some rows. Also, if the other query is a similar UPSERT, its alternative is to wait for this transaction to commit and then update right away. "Winning" could be a Pyrrhic victory.
About "empty updates":
OK, you asked for it:
WITH data(id) AS (              -- Only 1st column gets explicit name!
   VALUES                       -- rest gets default names "column2", etc.
     (1, 'foo_upd', NULL)       -- changed
   , (2, 'bar', NULL)           -- unchanged
   , (3, 'baz', NULL)           -- new
   , (4, 'baz', NULL)           -- new
   )
, ups AS (
   INSERT INTO tbl AS t
   TABLE  data                  -- short for: SELECT * FROM data
   ON     CONFLICT (id) DO UPDATE
   SET    id = t.id
   WHERE  false                 -- never executed, but locks the row!
   RETURNING t.id
   )
, del AS (
   DELETE FROM tbl AS t
   USING  data     d
   LEFT   JOIN ups u USING (id)
   WHERE  u.id IS NULL          -- not inserted !
   AND    t.id = d.id
   -- AND    t <> d             -- avoid empty updates - only for full rows
   RETURNING t.id
   )
, ins AS (
   INSERT INTO tbl AS t
   SELECT *
   FROM   data
   JOIN   del USING (id)        -- conflict impossible!
   RETURNING id
   )
SELECT ARRAY(TABLE ups) AS inserted  -- with UPSERT
     , ARRAY(TABLE ins) AS updated   -- with DELETE & INSERT
;
How?
data just provides data. Could be a table instead.
ups: UPSERT. Rows with conflicting id are not changed, but also locked.
del deletes conflicting rows. They remain locked.
ins inserts whole rows. Only allowed for the same transaction.
To check for empty updates test (before and after) with:
SELECT ctid, * FROM tbl; -- did the ctid change?
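As a small illustration of what an "empty update" looks like through ctid, here is a sketch against the test table, assuming a row with id = 2 exists. Postgres writes a new row version even when no value changes, so the ctid moves.

```sql
SELECT ctid, * FROM tbl WHERE id = 2;      -- note the ctid before
UPDATE tbl SET text = text WHERE id = 2;   -- "empty" update: no value actually changes
SELECT ctid, * FROM tbl WHERE id = 2;      -- ctid differs: a new row version was written
```

If the ctid stays the same across one of the queries above, the row was not rewritten.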
The (commented out) check for any changes in the row, AND t <> d, works even with NULL values because we are comparing two typed row values according to the manual:
two NULL field values are considered equal, and a NULL is considered larger than a non-NULL
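To see the difference the manual describes, the following sketch compares the same pair of rows twice: once as row constructors (SQL-standard rules, where a NULL field makes the result NULL) and once as composite values taken from whole-row references. The aliases a and b are made up for the illustration.

```sql
SELECT ROW(1, NULL::text) = ROW(1, NULL::text) AS row_constructors  -- SQL-standard rules apply
     , a = b                                   AS composite_values  -- composite-type rules apply
FROM  (VALUES (1, NULL::text)) a
    , (VALUES (1, NULL::text)) b;
```

Per the quoted passage, the first column should come out NULL and the second true, which is why t <> d can be used to detect unchanged rows even when columns hold NULL.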
Dynamic SQL works for a subset of leading columns too, preserving existing values.
The trick is to let Postgres build the query string with column names from the system catalogs dynamically, and then execute it.
See related answers for code:
Update multiple columns in a trigger function in plpgsql
Bulk update of all columns
SQL update fields of one table from fields of another one
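A rough sketch of the dynamic approach, not taken from the linked answers: it reads the column names of tbl from pg_attribute, builds the SET list, and executes the resulting UPSERT. The staging table tbl_staging is a hypothetical name introduced only for this example, and the sketch assumes it has the same column layout as tbl and that id is the only key column.

```sql
-- Hypothetical staging table holding the rows to upsert.
CREATE TEMP TABLE tbl_staging (LIKE tbl);
INSERT INTO tbl_staging VALUES (1, 'foo_upd', 'a'), (3, 'baz', 'c');

DO
$do$
DECLARE
   _sql text;
BEGIN
   -- Build "col = EXCLUDED.col" for every live column except the key.
   SELECT format(
             'INSERT INTO tbl SELECT * FROM tbl_staging
              ON CONFLICT (id) DO UPDATE SET %s'
           , string_agg(format('%1$I = EXCLUDED.%1$I', attname), ', ' ORDER BY attnum)
          )
   INTO   _sql
   FROM   pg_attribute
   WHERE  attrelid = 'tbl'::regclass
   AND    attnum > 0             -- skip system columns
   AND    NOT attisdropped       -- skip dropped columns
   AND    attname <> 'id';       -- the conflict target is not updated

   RAISE NOTICE '%', _sql;       -- inspect the generated statement
   EXECUTE _sql;
END
$do$;
```

Printing the statement before executing it makes it easy to verify the generated column list; for repeated use, the same logic could be wrapped in a function that takes the table name as a parameter.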