Executing a trigger AFTER the completion of a transaction

In PostgreSQL, are DEFERRED triggers executed before (within) the completion of the transaction or just after it?

The documentation says:

DEFERRABLE
NOT DEFERRABLE

This controls whether the constraint can be deferred. A constraint that is not deferrable will be checked immediately after every command. Checking of constraints that are deferrable can be postponed until the end of the transaction (using the SET CONSTRAINTS command).

It doesn't specify if it is still inside the transaction or out. My personal experience says that it is inside the transaction and I need it to be outside!

Are DEFERRED (or INITIALLY DEFERRED) triggers executed inside of the transaction? And if they are, how can I postpone their execution to the time when the transaction is completed?

To give you a hint of what I'm after: I'm using pg_notify and RabbitMQ (PostgreSQL LISTEN Exchange) to send out messages, which I process in an external application. Right now I have a trigger that notifies the external app of newly inserted records by including the record's id in the message. But non-deterministically, once in a while, when I try to select a record by the id at hand, the record cannot be found. That's because the transaction is not complete yet and the record has not actually been added to the table. If only I could postpone the execution of the trigger until after the completion of the transaction, everything would work out.

To get better answers, let me describe the situation closer to the real world. The actual scenario is a little more complicated than what I explained before. The source code can be found here if anyone's interested. For reasons that I'm not going to dig into, I have to send the notification from another database, so the notification is actually sent like:

PERFORM * FROM dblink('hq','SELECT pg_notify(''' || channel || ''', ''' || payload || ''')');

Which I'm sure makes the whole situation much more complicated.

Mehran asked Aug 30 '16 04:08


2 Answers

Triggers (including all sorts of deferred triggers) fire inside the transaction.

But that is not the problem here, because notifications are delivered between transactions anyway.

The manual on NOTIFY:

NOTIFY interacts with SQL transactions in some important ways. Firstly, if a NOTIFY is executed inside a transaction, the notify events are not delivered until and unless the transaction is committed. This is appropriate, since if the transaction is aborted, all the commands within it have had no effect, including NOTIFY. But it can be disconcerting if one is expecting the notification events to be delivered immediately. Secondly, if a listening session receives a notification signal while it is within a transaction, the notification event will not be delivered to its connected client until just after the transaction is completed (either committed or aborted). Again, the reasoning is that if a notification were delivered within a transaction that was later aborted, one would want the notification to be undone somehow — but the server cannot "take back" a notification once it has sent it to the client. So notification events are only delivered between transactions. The upshot of this is that applications using NOTIFY for real-time signaling should try to keep their transactions short.

Bold emphasis mine.

pg_notify() is just a convenient wrapper function for the SQL NOTIFY command.
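
The commit-time delivery described in the manual can be observed with two psql sessions. This is a minimal sketch; the channel name new_item and the payload '42' are made up for illustration.

```sql
-- Session A (the listener)
LISTEN new_item;

-- Session B (the sender)
BEGIN;
SELECT pg_notify('new_item', '42');   -- same effect as: NOTIFY new_item, '42';
-- Session A has received nothing so far.
COMMIT;                               -- the notification is delivered only now
```

In psql, the listening session reports the notification the next time it runs a command.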

If some rows cannot be found after a notification has been received, there must be a different cause! Go find it. Likely candidates:

  • Concurrent transactions interfering
  • Triggers doing something more or different than you think they do.
  • All sorts of programming errors.

Either way, like the manual suggests, keep transactions that send notifications short.

dblink

Update: Transaction control in a PROCEDURE or DO statement in Postgres 11 or later makes this a lot simpler. Just COMMIT; to (also) send waiting notifications.
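
A minimal sketch of that approach, assuming Postgres 11 or later and a notification sent from the same database. The table item, the procedure add_item_and_notify and the channel new_item are hypothetical names, not taken from the question.

```sql
-- Hypothetical table, for illustration only
CREATE TABLE IF NOT EXISTS item (
   id   bigserial PRIMARY KEY,
   body text
);

CREATE OR REPLACE PROCEDURE add_item_and_notify(_body text)
  LANGUAGE plpgsql AS
$$
DECLARE
   _id bigint;
BEGIN
   INSERT INTO item (body) VALUES (_body) RETURNING id INTO _id;
   PERFORM pg_notify('new_item', _id::text);

   COMMIT;  -- the row becomes visible and the queued notification is sent

   -- anything after this point runs in a new transaction
END
$$;

-- Must be called outside an explicit transaction block,
-- otherwise the COMMIT inside the procedure raises an error.
CALL add_item_and_notify('hello');
```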


Original answer (mostly for Postgres 10 or older):

PERFORM * FROM dblink('hq','SELECT pg_notify(''' || channel || ''', ''' || payload || ''')');

... which should be rewritten with format() to simplify and make the syntax secure:

PERFORM dblink('hq', format('NOTIFY %I, %L', channel, payload));

dblink is a game-changer here, because it opens a separate transaction in the other database. This is sometimes used to fake an autonomous transaction.

  • Does Postgres support nested or autonomous transactions?

  • How do I do large non-blocking updates in PostgreSQL?

dblink() waits for the remote command to finish. So the remote transaction will most probably commit first. The manual:

The function returns the row(s) produced by the query.

If you can send the notification from the same transaction instead, that would be a clean solution.

Workaround for dblink

If notifications have to be sent from a different transaction, there is a workaround with dblink_send_query():

dblink_send_query sends a query to be executed asynchronously, that is, without immediately waiting for the result.

DO  -- or plpgsql function
$$
BEGIN
   -- do stuff

   PERFORM dblink_connect   ('con1', 'your_connstr_or_foreign_server_here');  -- connection name must match the calls below
   PERFORM dblink_send_query('con1', format('SELECT pg_sleep(3); NOTIFY %I, %L ', 'Channel', 'payload'));
   PERFORM dblink_disconnect('con1');
END
$$;

If you do this right before the end of the transaction, your local transaction gets a 3-second head start (pg_sleep(3)) to commit. Choose an appropriate number of seconds.

There is an inherent uncertainty to this approach, since you get no error message if anything goes wrong. For a secure solution you need a different design. After successfully sending the command, chances for it to still fail are extremely slim, though. The chance that successful notifications are missed seems much higher, but that's built into your current solution already.

Safe alternative

A safer alternative would be to write to a queue table and poll it, as discussed in @Bohemian's answer. This related answer demonstrates how to poll safely:

  • Postgres UPDATE … LIMIT 1

Erwin Brandstetter answered Oct 31 '22 16:10


I'm posting this as an answer, assuming the actual problem you are trying to solve is deferring execution of an external process until after the transaction is completed (rather than the X-Y "problem" you're trying to solve using trigger Kung Fu).

Having the database tell an app to do something is a broken pattern. It's broken because:

  1. There's no fallback if the app doesn't get the message, e.g. because it's down, the network explodes, whatever. Even the app replying with an acknowledgment (which it can't) wouldn't fix this problem (see next point).
  2. There's no sensible way to retry the work if the app gets the message but fails to complete it (for any of lots of reasons).

In contrast, using the database as a persistent queue, having the app poll it for work, and taking the work off the queue when it is complete has none of the above problems.

There are lots of ways to achieve this. The one I prefer is to have some process (usually a trigger on insert, update and delete) put data into a "queue" table. Have another process poll that table for work to do, and delete from the table when work is complete.
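
A minimal sketch of the producer side, under stated assumptions: the tables item and item_queue, their columns, and the trigger function item_enqueue() are hypothetical names chosen for illustration.

```sql
-- Hypothetical source table and queue table, for illustration only
CREATE TABLE IF NOT EXISTS item (
   id   bigserial PRIMARY KEY,
   body text
);

CREATE TABLE IF NOT EXISTS item_queue (
   queue_id   bigserial   PRIMARY KEY,
   item_id    bigint      NOT NULL,
   operation  text        NOT NULL,               -- 'INSERT', 'UPDATE' or 'DELETE'
   created_at timestamptz NOT NULL DEFAULT now()
);

CREATE OR REPLACE FUNCTION item_enqueue()
  RETURNS trigger
  LANGUAGE plpgsql AS
$$
BEGIN
   -- The queue row is written in the same transaction as the change itself,
   -- so it becomes visible to the consumer only when that transaction commits.
   IF TG_OP = 'DELETE' THEN
      INSERT INTO item_queue (item_id, operation) VALUES (OLD.id, TG_OP);
   ELSE
      INSERT INTO item_queue (item_id, operation) VALUES (NEW.id, TG_OP);
   END IF;
   RETURN NULL;  -- the return value is ignored for AFTER row triggers
END
$$;

CREATE TRIGGER item_enqueue
AFTER INSERT OR UPDATE OR DELETE ON item
FOR EACH ROW EXECUTE PROCEDURE item_enqueue();  -- EXECUTE FUNCTION also works in Postgres 11+
```

Manually inserting a row into item_queue puts an item on the queue without touching the source table.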

It also adds some other benefits:

  • The production and consumption of work are decoupled, which means you can safely kill and restart your app (which must happen from time to time, e.g. when deploying) - the queue table will happily grow while the app is down, and will drain when the app is back up. You can even replace the app with an entirely new one
  • If for whatever reason you want to initiate processing of certain items, you can just manually insert rows into the queue table. I used this technique myself to initiate the processing of all items in a database that needed initialising by being put on the queue once. Importantly, I didn't need to do a perfunctory update to every row just to fire the trigger
  • Getting to your question, a slight delay can be introduced by adding a timestamp column to the queue table and having the poll query only select rows that are older than (say) 1 second, which gives the database time to complete its transaction
  • You can't overload the app. The app will read only as much work as it can handle. If your queue is growing, you need a faster app, or more apps. If multiple consumers are operating, concurrency can be handled by (for example) adding a "token" column to the queue table
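
The consumer side could look roughly like the following sketch. The table item_queue and its columns are hypothetical. Note that it swaps in row locking with FOR UPDATE SKIP LOCKED (Postgres 9.5 or later) instead of the "token" column mentioned above, so that several consumers never claim the same row.

```sql
-- Hypothetical queue table, for illustration only
CREATE TABLE IF NOT EXISTS item_queue (
   queue_id   bigserial   PRIMARY KEY,
   item_id    bigint      NOT NULL,
   operation  text        NOT NULL,
   created_at timestamptz NOT NULL DEFAULT now()
);

BEGIN;

-- Claim and remove the oldest queue row that is at least 1 second old.
DELETE FROM item_queue
WHERE  queue_id = (
   SELECT queue_id
   FROM   item_queue
   WHERE  created_at < now() - interval '1 second'
   ORDER  BY queue_id
   LIMIT  1
   FOR    UPDATE SKIP LOCKED   -- skip rows already claimed by another consumer
   )
RETURNING item_id, operation;

-- ... the application processes the returned row here ...

COMMIT;  -- on ROLLBACK the row stays in the queue and can be retried
```

If the query returns no row, the queue is empty (or every eligible row is currently claimed) and the consumer can sleep before polling again.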

Queues backed by database tables are the basis of how persistent queues are implemented in commercial-grade queue-based platforms, so the pattern is well tested, used and understood.

Leave the database to do what it does best, and the only thing it does well: Manage data. Don't try to make your database server into an app server.

Bohemian answered Oct 31 '22 15:10