Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to write an upsert trigger in PostgreSQL?

In PostgreSQL 9.6 and later, what is the correct way to define a trigger function that will perform an update whenever an insert would fail because of a uniqueness constraint?

I know it is straightforward to write insert ... on conflict ... do update set ... statements, but my idea is that I want to have some tables that treat repeated inserts as updates; otherwise that piece of logic would have to be taken care of by the application, not the DB.

One solution that I have found and that superficially does work is this:

create table versions (
  key           text primary key,
  version       text );

/* ### TAINT not sure whether there may be race conditions with this upsert trigger */
create function on_before_insert_versions() returns trigger language plpgsql volatile as $$ begin
  if exists ( select 1 from versions where key = new.key ) then
    update versions set version = new.version where key = new.key;
    return null;
    end if;
  return new;
  end; $$;

create trigger on_before_insert_versions
  before insert on versions for each row execute procedure on_before_insert_versions();

insert into versions values
  ( 'server', '3.0.3' ),
  ( 'api',    '2' );

insert into versions values
  ( 'api',    '3' );

select * from versions;

  key   | version 
--------+---------
 server | 3.0.3
 api    | 3

However, isn't the trigger prone to race conditions? I tried to use an insert ... on conflict ... do update set ... statement in the trigger, but that failed of course because it triggers the trigger function itself, leading to infinite regress.

I also tried to use a pair of alter table ... disable trigger ... / enable statements, but that errors with cannot ALTER TABLE ... because it is being used by active queries in this session.

What is the canonical form to always perform an update instead of an insert on uniqueness constraint violation in PostgreSQL?

Update—Upserts in PostgreSQL, or their long-time absence, are somehwat of a hot topic, and many not-so-perfect solutions are regularly suggested.

Given that the Postgres maintainers have taken so much time and care to make insert ... on conflict .. do update work without race conditions, it is perhaps unwise to embrace a homebrewed solution that 'seems to work' (until it doesn't).

When I wrote my question, I was adamant to have an insert trigger that does an update on conflict; this is not well supported by PostgreSQL, the chief problem being that the insert that you perform on the same table within a before insert trigger will cause that same trigger to be called. @Laurenz Albe suggested how to escape from the infinite loop, and while the proposed technique (ingenious!) looks like a good thing to remember, we do not know about possible impacts on performance or other side effects.

In then end, @Ilya Dyoshin hit the nail by proposing to just call a function from the application that wraps the necessary SQL logic. I feel this is a win/win solution because

1) it does not change the semantics of insert into x for table x to 'really mean update, sometimes';

2) the 'upsert semantics' are made explicit in the application code, but not spelled out in gory detail;

3) you can still do an insert without intending an implicit 'update'—in hindsight, this is perhaps the most important consideration.

like image 679
John Frazer Avatar asked Nov 29 '17 21:11

John Frazer


2 Answers

I agree with Ilya that it would be better to do this in the application in a straightforward way.

But I take it in the spirit of a thought experiment, and my solution uses the power of pg_trigger_depth() to escape endless recursion:

CREATE OR REPLACE FUNCTION on_before_insert_versions() RETURNS trigger
   LANGUAGE plpgsql AS
$$BEGIN
   IF pg_trigger_depth() = 1 THEN
      INSERT INTO versions (key, version) VALUES (NEW.key, NEW.version)
         ON CONFLICT (key)
         DO UPDATE SET version = NEW.version;
      RETURN NULL;
   ELSE
      RETURN NEW;
   END IF;
END;$$;

Your solution is definitely vulnerable to a race condition: two concurrent INSERTs can lead to concurrently running triggers, both of which cannot find a matching row in versions and consequently lead to an INSERT, one of which must fail.

like image 82
Laurenz Albe Avatar answered Oct 01 '22 00:10

Laurenz Albe


it is better to use pure upsert.

otherwise you can introduce more complicated logic and don't return the data for insert from the trigger (read the docs = if trigger before insert is not returning value no insert is performed)

like image 34
Ilya Dyoshin Avatar answered Oct 01 '22 00:10

Ilya Dyoshin