I have a problem with ON CONFLICT DO UPDATE
in Postgres 9.5 when I try to use more than one source in the FROM statement.
Example of working code:
INSERT INTO new.bookmonographs (citavi_id, abstract, createdon, edition, title, year)
SELECT "ID", "Abstract", "CreatedOn"::timestamp, "Edition", "Title", "Year"
FROM old."Reference"
WHERE old."Reference"."ReferenceType" = 'Book'
AND old."Reference"."Year" IS NOT NULL
AND old."Reference"."Title" IS NOT NULL
ON CONFLICT (citavi_id) DO UPDATE
SET (abstract, createdon, edition, title, year) = (excluded.abstract, excluded.createdon, excluded.edition, excluded.title, excluded.year)
;
Faulty code:
INSERT INTO new.bookmonographs (citavi_id, abstract, createdon, edition, title, year)
SELECT "ID", "Abstract", "CreatedOn"::timestamp, "Edition", "Title", "Year"
FROM old."Reference", old."ReferenceAuthor"
WHERE old."Reference"."ReferenceType" = 'Book'
AND old."Reference"."Year" IS NOT NULL
AND old."Reference"."Title" IS NOT NULL
AND old."ReferenceAuthor"."ReferenceID" = old."Reference"."ID"
--Year, Title and Author must be present in the data, otherwise the entry is deemed useless, hence won't be included
ON CONFLICT (citavi_id) DO UPDATE
SET (abstract, createdon, edition, title, year) = (excluded.abstract, excluded.createdon, excluded.edition, excluded.title, excluded.year)
;
I added an additional source in the FROM statement and one more WHERE statement to make sure only entries that have a title, year and author are inserted into the new database. (If old."Reference"."ID" exists in old."ReferenceAuthor" as "ReferenceID", then an author exists.) Even without the additional WHERE statement the query is faulty. The columns I specified in SELECT are only present in old."Reference"
, not in old."ReferenceAuthor"
.
Currently old."ReferenceAuthor"
and old."Reference"
don't have a UNIQUE CONSTRAINT,the uniqe constraints for bookmonographs are:
CONSTRAINT bookmonographs_pk PRIMARY KEY (bookmonographsid),
CONSTRAINT bookmonographs_bookseries FOREIGN KEY (bookseriesid)
REFERENCES new.bookseries (bookseriesid) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT bookmonographs_citaviid_unique UNIQUE (citavi_id)
The error PSQL throws:
ERROR: ON CONFLICT DO UPDATE command cannot affect row a second time HINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values. ********** Error **********
ERROR: ON CONFLICT DO UPDATE command cannot affect row a second time SQL state: 21000 Hint: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.
I don't know what's wrong, or why the hint points to a duplicated constrained value.
The problem is caused by the fact that apparently some entries have multiple authors. So the inner join in the select query that you wrote will return multiple rows for the same entry and INSERT ... ON CONFLICT
doesn't like that. Since you only use the ReferenceAuthor
table for filtering, you can simply rewrite the query so that it uses that table to only filter entries that don't have any author by doing an exists
on a correlated subquery. Here's how:
INSERT INTO new.bookmonographs (citavi_id, abstract, createdon, edition, title, year)
SELECT "ID", "Abstract", "CreatedOn"::timestamp, "Edition", "Title", "Year"
FROM old."Reference"
WHERE old."Reference"."ReferenceType" = 'Book'
AND old."Reference"."Year" IS NOT NULL
AND old."Reference"."Title" IS NOT NULL
AND exists(SELECT FROM old."ReferenceAuthor" WHERE old."ReferenceAuthor"."ReferenceID" = old."Reference"."ID")
--Year, Title and Author must be present in the data, otherwise the entry is deemed useless, hence won't be included
ON CONFLICT (citavi_id) DO UPDATE
SET (abstract, createdon, edition, title, year) = (excluded.abstract, excluded.createdon, excluded.edition, excluded.title, excluded.year)
;
Use an explicit INNER JOIN
to join the two source tables together:
INSERT INTO new.bookmonographs (citavi_id, abstract, createdon, edition, title, year)
SELECT "ID", "Abstract", "CreatedOn"::timestamp, "Edition", "Title", "Year"
FROM old."Reference"
INNER JOIN old."ReferenceAuthor" -- explicit join
ON old."ReferenceAuthor"."ReferenceID" = old."Reference"."ID" -- ON condition
WHERE old."Reference"."ReferenceType" = 'Book' AND
old."Reference"."Year" IS NOT NULL AND
old."Reference"."Title" IS NOT NULL
ON CONFLICT (citavi_id) DO UPDATE
SET (abstract, createdon, edition, title, year) =
(excluded.abstract, excluded.createdon, excluded.edition, excluded.title,
excluded.year)
There's a great explanation of the issue in postgres' docs (ctrl + f
: "Cardinality violation" errors in detail
, as there's no direct link).
To quote from the docs:
The idea of raising "cardinality violation" errors is to ensure that any one row is affected no more than once per statement executed. In the lexicon of the SQL standard's discussion of SQL MERGE, the SQL statement is "deterministic". The user ought to be confident that a row will not be affected more than once - if that isn't the case, then it isn't predictable what the final value of a row affected multiple times will be.
To replay their simpler example, on table upsert
the below query could not work, as we couldn't reliably know if select val from upsert where key = 1
was equal to 'Foo' or 'Bar':
INSERT INTO upsert(key, val)
VALUES(1, 'Foo'), (1, 'Bar')
ON CONFLICT (key) UPDATE SET val = EXCLUDED.val;
ERROR: 21000: ON CONFLICT UPDATE command could not lock/update self-inserted tuple
HINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With