I am running a Python script that inserts a large amount of data into a Postgres database. I use a single query to perform multi-row inserts:
INSERT INTO table (col1,col2) VALUES ('v1','v2'),('v3','v4') ... etc
I was wondering what would happen if it hits a duplicate key for the insert. Will it stop the entire query and throw an exception? Or will it merely ignore the insert of that specific row and move on?
Duplicate rows are a fundamental part of the SQL model of data because the SQL language doesn't really try to implement the relational algebra. SQL uses a bag (multiset)-based algebra instead.
Note: MySQL offers INSERT IGNORE, which silently discards a row that would duplicate an existing record instead of raising an error. Postgres has no INSERT IGNORE; its closest equivalent is INSERT ... ON CONFLICT DO NOTHING, covered below.
The INSERT will just insert all rows and nothing special will happen, unless you have some kind of constraint disallowing duplicate / overlapping values (a PRIMARY KEY, UNIQUE, CHECK, or EXCLUDE constraint) - which you did not mention in your question. But that's what you are probably worried about.
Assuming a UNIQUE or PK constraint on (col1, col2), you are dealing with a textbook UPSERT situation, for which there are many related questions and answers.
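For the examples below, assume a minimal table like this (the UNIQUE constraint and its name are illustrative assumptions, not taken from the question):

CREATE TABLE tbl (
   col1 text
 , col2 text
 , CONSTRAINT tbl_col1_col2_uni UNIQUE (col1, col2)  -- assumed constraint
);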
Generally, if any constraint is violated, an exception is raised which (unless trapped in a subtransaction, as is possible in a procedural server-side language like plpgsql) will roll back not only the statement but the whole transaction.
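To illustrate with the assumed table above: a single conflicting row makes the whole multi-row INSERT fail, and nothing is inserted (a sketch, not output from the question's script):

INSERT INTO tbl (col1, col2)
VALUES ('v1', 'v2')
     , ('v1', 'v2');  -- second row violates the assumed UNIQUE constraint
-- ERROR:  duplicate key value violates unique constraint "tbl_col1_col2_uni"
-- No rows are inserted; unless trapped, the whole transaction is rolled back.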
The WHERE NOT EXISTS / EXCEPT techniques below assume no concurrent write load, i.e. no other transactions will try to write to the same table at the same time.
Exclude rows that are already in the table with WHERE NOT EXISTS ... or any other applicable technique (see: Select rows which are not present in other table). And don't forget to remove duplicates within the inserted set as well, which would not be excluded by the semi-anti-join WHERE NOT EXISTS .... One technique to deal with both at once is EXCEPT:
INSERT INTO tbl (col1, col2)
VALUES
   (text 'v1', text 'v2')  -- explicit type cast may be needed in 1st row
 , ('v3', 'v4')
 , ('v3', 'v4')            -- beware of dupes in source
EXCEPT SELECT col1, col2 FROM tbl;
EXCEPT without the keyword ALL folds duplicate rows in the source. If you know there are no dupes, or you don't want to fold duplicates silently, use EXCEPT ALL (or one of the other techniques).
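A sketch of the EXCEPT ALL variant (same assumed table tbl): source duplicates are kept, so with a UNIQUE constraint in place they raise an error instead of disappearing silently:

INSERT INTO tbl (col1, col2)
VALUES
   (text 'v1', text 'v2')
 , ('v3', 'v4')
 , ('v3', 'v4')  -- dupe is kept by EXCEPT ALL
EXCEPT ALL
SELECT col1, col2 FROM tbl;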
Generally, if the target table is big, WHERE NOT EXISTS in combination with DISTINCT on the source will probably be faster:
INSERT INTO tbl (col1, col2)
SELECT *
FROM  (
   SELECT DISTINCT *
   FROM  (
      VALUES
         (text 'v1', text 'v2')
       , ('v3', 'v4')
       , ('v3', 'v4')  -- dupes in source
      ) t(c1, c2)
   ) t
WHERE NOT EXISTS (
   SELECT FROM tbl
   WHERE  col1 = t.c1 AND col2 = t.c2
   );
If there can be many dupes, it pays to fold them in the source first. Otherwise, you can drop the extra subquery.
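A sketch of that leaner variant for a source that is already free of duplicates (same assumed table), dropping the inner DISTINCT subquery:

INSERT INTO tbl (col1, col2)
SELECT *
FROM  (
   VALUES
      (text 'v1', text 'v2')
    , ('v3', 'v4')
   ) t(c1, c2)
WHERE NOT EXISTS (
   SELECT FROM tbl
   WHERE  col1 = t.c1 AND col2 = t.c2
   );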
Or use the Postgres UPSERT implementation, INSERT ... ON CONFLICT ..., available in Postgres 9.5 or later:
INSERT INTO tbl (col1, col2)
SELECT DISTINCT *  -- still can't insert the same row more than once
FROM  (
   VALUES
      (text 'v1', text 'v2')
    , ('v3', 'v4')
    , ('v3', 'v4')  -- you still need to fold dupes in source!
   ) t(c1, c2)
ON CONFLICT DO NOTHING;  -- ignores rows with *any* conflict!
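If you only want to ignore conflicts on one particular constraint (and still get errors for anything else), you can name a conflict target. A minimal sketch, assuming the UNIQUE constraint on (col1, col2) from the illustration above:

INSERT INTO tbl (col1, col2)
SELECT DISTINCT *
FROM  (
   VALUES
      (text 'v1', text 'v2')
    , ('v3', 'v4')
   ) t(c1, c2)
ON CONFLICT (col1, col2) DO NOTHING;  -- only ignores conflicts on this unique index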
Further reading: the Postgres documentation, and Craig's reference answer for UPSERT problems.
Will it stop the entire query and throw an exception? Yes.
To avoid that, you can look at this SO question, which describes how to keep Postgres from throwing an error for multi-row inserts when some of the inserted keys already exist in the DB.
You should basically do this:
INSERT INTO DBtable (id, field1)
SELECT 1, 'value'
WHERE NOT EXISTS (
   SELECT id FROM DBtable WHERE id = 1
);
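The same pattern extends to multiple rows with a VALUES list. A sketch reusing the example's DBtable (the second row's values are illustrative):

INSERT INTO DBtable (id, field1)
SELECT v.id, v.field1
FROM  (
   VALUES
      (1, text 'value')
    , (2, 'other value')
   ) v(id, field1)
WHERE NOT EXISTS (
   SELECT 1 FROM DBtable d WHERE d.id = v.id
);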