Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to use triggers to prevent duplicate records in PostgreSQL?

I wish to create a stored procedure (in plpgsql, PostgreSQL 9.1) that first checks to be sure that the record which is going to be inserted is unique on four of its columns, or if a record is updated, that it is updated to unique values.

  Example:
    Record (1,2,3,4) is to be inserted.
    If Record (1,2,3,4) already exists, then do not insert a duplicate record.
    if Record (1,2,3,4) does not exist, then insert it.

    Record (1,2,3,4) is to be updated to (5,6,7,8).
    If Record (5,6,7,8) already exists, then do not update the record. (duplicate record not allowed).
    If Record (5,6,7,8) does not exist, then update the record to the new values.

I previously had used a unique index on the record's fields, but would like to learn how a trigger is written to accomplish this.

like image 988
Alan Wayne Avatar asked Apr 28 '15 18:04

Alan Wayne


People also ask

How you will avoid duplicating records in a query?

The SQL DISTINCT keyword, which we have already discussed is used in conjunction with the SELECT statement to eliminate all the duplicate records and by fetching only the unique records.


1 Answers

I previously had used a unique index on the record's fields, but would like to learn how a trigger is written to accomplish this.

This is a misunderstanding. If a set of columns is supposed to be unique, use a UNIQUE constraint (or make it the PK) in any case. And be aware of a special role for NULL values:

  • Composite PRIMARY KEY enforces NOT NULL constraints on involved columns
  • Create unique constraint with null columns

The rest of the answer is largely outdated. Since Postgres 9.5 added UPSERT there is a simpler solution:

INSERT INTO tbl (col1, col2, col3, col4)
VALUES (1, 2, 3, 4)
ON     CONFLICT ON CONSTRAINT my_4_col_uni DO NOTHING;

The rest is for Postgres 9.4 or older

Triggers can help to enforce the constraint. But they fail to enforce uniqueness on their own due to inherent race conditions.

You can just let the unique constraint handle duplicate keys. You'll get an EXCEPTION for violations. To avoid exceptions most of the time1 you can use a simple trigger:

CREATE OR REPLACE FUNCTION tbl_ins_up_before()
  RETURNS trigger AS
$func$
BEGIN

IF EXISTS (SELECT 1 FROM tbl
           WHERE (col1,     col2,     col3,     col4)
           = (NEW.col1, NEW.col2, NEW.col3, NEW.col4)) THEN
   RETURN NULL;
END IF;

RETURN NEW;

END
$func$  LANGUAGE plpgsql;

CREATE TRIGGER ins_up_before
BEFORE INSERT OR UPDATE OF col1, col2, col3, col4  -- fire only when relevant
ON tbl
FOR EACH ROW EXECUTE PROCEDURE tbl_ins_up_before();

1There is an inherent race condition in the time slice between checking if a row already exists and actually inserting the row, which cannot be avoided unless you lock the table exclusively (very expensive). Details depend on the exact definition of your constraint (may be deferrable). So you might still get an exception if a concurrent transaction also finds (at virtually the same moment) that (1,2,3,4) is not there yet and inserts before you. Or the operation might get aborted, but the existing row is deleted before you can commit.

This cannot be fixed with row-level locking either, because you cannot lock rows that aren't there yet (predicate locking) in Postgres up to version 9.6.

You need a unique constraint, which enforces uniqueness at all times.

I would have the constraint and then use this query:

INSERT INTO tbl (col1, col2, col3, col4)
SELECT 1, 2, 3, 4
WHERE  NOT EXISTS (
   SELECT 1 FROM tbl
   WHERE (col1, col2, col3, col4) = (1, 2, 3, 4);

Similar for UPDATE.

You could encapsulate INSERT / UPDATE in a plpgsql function and trap duplicate key violations. Example:

  • Is SELECT or INSERT in a function prone to race conditions?
like image 131
Erwin Brandstetter Avatar answered Sep 19 '22 19:09

Erwin Brandstetter