Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Implementing incremental client updates with rowversions in Postgres

I am a newcomer to Postgres and love it so far. I've given this problem a lot of thought already, RTFM to the best of my ability, but hit a dead end, so I need a nudge in the right direction.

I'm designing a database where each entity of interest has a rowversion column that gets assigned a value from a global sequence. So, in the simplest scenario, in a table emps with two rows: emp1 with rowversion@3 and emp2 with rowversion@5, I know emp2 was modified after emp1 (ie in a later transaction - don't really mind if rows within the same transaction have the same rowversion).

This is to form the foundation of a data sync logic, where a client that knows they have everything up until @3, can get the latest updates using a query such as SELECT * FROM emps WHERE rowversion>3 and rowversion<=new_anchor.

Here is an example scenario for a client already updated @3 - assume the following transactions since:

@3 - committed
@4 - committed
@5 - committed
@6 - in progress - not committed yet
@7 - committed
@8 - in progress - not committed yet
@9 - committed

Client update is performed in three stages:

  1. Ask the database for an appropriate new_anchor.
  2. Perform SELECT * FROM emps WHERE rowversion>3 and rowversion<=new_anchor.
  3. Pass the new_anchor value back to the client, along with the result data.

Since the rows with rowversion @6 and @8 are still in progress, new_anchor has to be @5, so that our range query doesn't miss any uncommitted updates. Now the client can be confident it has everything up until @5.

So the actual problem distilled: how can this new_anchor be safely determined without forcing SERIALIZABLE or otherwise seriously hurting performance?

As you can probably tell I've borrowed this idea from SQL Server, where this problem is trivially solved by the min_active_rowversion() function. This function would return @6 in the above scenario, so your new_anchor can safely be min_active_rowversion() - 1. I sort of had an idea how this could be implemented in Postgres using an active_rowversions table, triggers, and SELECT min(id) FROM active_rowversions, but that would require READ UNCOMMITTED isolation, which is not available in Postgres.

I would really appreciate any help or ideas.

like image 626
dev Avatar asked Feb 11 '15 00:02

dev


1 Answers

Turns out the solution is much simpler than initially thought, thanks to Postgres' System Information Functions.

  • txid_current() can be used in a trigger to assign a record's rowversion.
  • txid_snapshot_min(txid_current_snapshot()) can be used to get the minimum active transaction in the same way an SQL Server user might use min_active_rowversion().

The best part is these are 64-bit, permanent, non-subject to vacuuming:

These functions export a 64-bit format that is extended with an "epoch" counter so it will not wrap around during the life of an installation.

Postgres is truly amazing.

like image 170
dev Avatar answered Sep 29 '22 08:09

dev