Why doesn't this rule prevent duplicate key violations?

Tags:

(postgresql) I was trying to COPY csv data into a table but I was getting duplicate key violation errors, and there's no way to tell COPY to ignore those, so following internet wisdom I tried adding this rule:

CREATE OR REPLACE RULE ignore_duplicate_inserts AS
   ON INSERT TO mytable
   WHERE (EXISTS ( SELECT mytable.id
           FROM mytable
          WHERE mytable.id = new.id)) DO NOTHING;

to circumvent the problem, but I still get those errors - any ideas why ?

820

asked Mar 11 '11 06:03

Hendekagon

2 Answers

Rules by default add things to the current action:

Roughly speaking, a rule causes additional commands to be executed when a given command on a given table is executed.

But an INSTEAD rule allows you to replace the action:

Alternatively, an INSTEAD rule can replace a given command by another, or cause a command not to be executed at all.

So, I think you want to specify INSTEAD:

CREATE OR REPLACE RULE ignore_duplicate_inserts AS
   ON INSERT TO mytable
   WHERE (EXISTS ( SELECT mytable.id
           FROM mytable
          WHERE mytable.id = new.id)) DO INSTEAD NOTHING;

Without the INSTEAD, your rule is essentially saying "do the INSERT and then do nothing" when you want to say "instead of the INSERT, do nothing" and, AFAIK, the DO INSTEAD NOTHING will do that.

I'm not an expert on PostgreSQL rules but I think adding the "INSTEAD" should work.

UPDATE: Thanks to araqnid we know that:

COPY FROM will invoke any triggers and check constraints on the destination table. However, it will not invoke rules

So a rule isn't going to work in this situation. However, triggers are fired during COPY FROM so you could write a BEFORE INSERT trigger that would return NULL when it detected duplicate rows:

It can return NULL to skip the operation for the current row. This instructs the executor to not perform the row-level operation that invoked the trigger (the insertion or modification of a particular table row).

That said, I think you'd be better off with araqnid's "load it all into a temporary table, clean it up, and copy it to the final destination" would be a more sensible solution for a bulk loading operation like you have.

126

answered Sep 18 '22 20:09

mu is too short

COPY FROM will not invoke rules (http://www.postgresql.org/docs/9.0/interactive/sql-copy.html#AEN58860)

My approach would be to load the CSV data into a temp table, then use an INSERT...SELECT statement to copy the data into the target table where it doesn't already exist. (If there are duplicates in the CSV data itself, remove those from the temp table first). Something like:

BEGIN;
CREATE TEMP TABLE stage_data(key_column, data_columns...) ON COMMIT DROP;
\copy stage_data from data.csv with csv header
-- prevent any other updates while we are merging input (omit this if you don't need it)
LOCK target_data IN SHARE ROW EXCLUSIVE MODE;
-- insert into target table
INSERT INTO target_data(key_column, data_columns...)
   SELECT key_column, data_columns...
   FROM stage_data
   WHERE NOT EXISTS (SELECT 1 FROM target_data
                     WHERE target_data.key_column = stage_data.key_column)
END;

answered Sep 17 '22 20:09

araqnid

Related questions
                            
                                Output "yes/no" instead of "t/f" for boolean data type in PostgreSQL
                            
                                Question mark operator in query
                            
                                What happens if we kill the JVM process during transaction execution?
                            
                                Convert table into array of custom types
                            
                                Handling Postgres json datatype in slick, scala
                            
                                Performance of DELETE with NOT IN (SELECT ...)
                            
                                Indexing data from postgres to solr/elasticsearch
                            
                                Default connection pool for tomcat in spring-boot?
                            
                                How To change or alter column type in Views using PostgreSQL?
                            
                                Missing hstore extension
                            
                                postgres syntax error at or near "ON"
                            
                                Postgresql - how to run a query on multiple tables with same schema
                            
                                Connecting psycopg2 with Python in Heroku
                            
                                Postgresql - How to speed up for updating huge table(100 million rows)?
                            
                                DISTINCT ON() in jOOQ
                            
                                TIMESTAMP vs TIMESTAMPTZ correct value when taking time difference
                            
                                How to change a table ID from serial to identity?
                            
                                How to list all constraints of a table in PostgreSQL?
                            
                                pg_config, ruby pg, postgresql 9.0 problem after upgrade, centos 5
                            
                                SQL 'LIKE' syntax

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Why doesn't this rule prevent duplicate key violations?

Tags:

copy

duplicates

postgresql

insert

rules

Hendekagon

People also ask

2 Answers

mu is too short

araqnid

Recent Activity

Donate For Us