
How to list all tables with data changes in the last 24 hours?

We had an ugly problem: by mistake, a load balancer redirected some requests to a test instance whose data is very similar to production. Now I know that the test Postgres database contains recorded data that belongs in production.

Is there a way to list all the tables with data changes in the last 24 hours in Postgres?

Postgres version is 9.3 and I have around 250 tables.

martinkenneth asked Sep 10 '15 22:09



1 Answer

First consider my comment.

Postgres up to and including 9.4 does not by itself record timestamps when rows were inserted or updated.

There are some system columns in the row headers that can help in the forensic work. The physical order of rows (ctid) can be an indicator if nothing else has happened to the table since. In simple cases, new rows are appended to the physical end of a table when inserted, so the ctid indicates what was inserted last, until anything changes in the table. Postgres is free to rearrange the physical order of rows at will, for instance with VACUUM. Any UPDATE also writes a new row version, which can change the physical position. The new version does not have to be at the end of the table. Postgres tries to keep new row versions on the same data page if possible (HOT update) ...

That said, here is a simple query to get the physically last rows for a given table:

SELECT ctid, *
FROM   big
ORDER  BY ctid DESC
LIMIT  10;

Related answers on dba.SE with detailed information:

  • VACUUM returning disk space to operating system
  • How do I decompose ctid into page and row numbers?

The insert transaction id xmin can be useful:

  • How to find out when data was inserted to Postgres?
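
As a rough sketch of how xmin might be used, the query below lists the rows of the example table big that were written by the most recent transactions. It relies on the built-in age() function, which reports how many transactions ago a given transaction id was issued. Note that an UPDATE also stamps the new row version with the updating transaction's id, and that rows frozen by VACUUM no longer carry a meaningful xmin, so treat the result as a hint, not as proof:

```sql
-- Rows written by the most recent transactions come first.
-- "big" is the example table name used in the query above.
SELECT xmin, age(xmin) AS xid_age, ctid, *
FROM   big
ORDER  BY age(xmin)   -- smallest age = most recent transaction
LIMIT  10;
```

Transaction ids carry no wall-clock time, so this only orders rows relative to each other. To narrow it down to the incident window you still need a known reference point, such as a row you know was written right before the balancer mistake.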

If you happen to have a backup for the test DB from right before the incident, that would be helpful. Restore the old state to a separate schema of the test DB and compare tables ...
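
A minimal sketch of such a comparison, assuming the backup was restored into a schema named backup (a hypothetical name) and the current table lives in public with an identical column layout:

```sql
-- Rows that exist now but not in the backup: inserted or updated since.
SELECT * FROM public.big
EXCEPT
SELECT * FROM backup.big;

-- Rows that existed in the backup but not now: deleted or updated since.
SELECT * FROM backup.big
EXCEPT
SELECT * FROM public.big;
```

EXCEPT compares whole rows, so every column type needs an equality operator; tables with columns of a type like json would need those columns cast or left out of the comparison.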

Typically, I add one or two timestamptz columns to important tables for when the row was inserted, and / or when it was updated the last time. That would be tremendously useful for you right now ...
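
One way this could look, as a sketch only: the column names created_at and updated_at, the function name trg_set_updated_at and the trigger name are made up for illustration, and big stands in for any important table. The columns are added without a default first, so existing rows stay NULL instead of all receiving the time of the ALTER TABLE:

```sql
-- Add the columns; existing rows keep NULL (their real times are unknown).
ALTER TABLE big
  ADD COLUMN created_at timestamptz,
  ADD COLUMN updated_at timestamptz;

-- From now on, new rows get the current transaction time.
ALTER TABLE big
  ALTER COLUMN created_at SET DEFAULT now(),
  ALTER COLUMN updated_at SET DEFAULT now();

-- Trigger function that refreshes updated_at on every UPDATE.
CREATE OR REPLACE FUNCTION trg_set_updated_at()
  RETURNS trigger AS
$func$
BEGIN
   NEW.updated_at := now();
   RETURN NEW;
END
$func$ LANGUAGE plpgsql;

CREATE TRIGGER big_set_updated_at
BEFORE UPDATE ON big
FOR EACH ROW EXECUTE PROCEDURE trg_set_updated_at();
```

With this in place, a filter like WHERE updated_at > now() - interval '24 hours' finds recently inserted or updated rows. Deleted rows leave no trace this way.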

What would also be great for you: the "temporal" features introduced in the SQL standard with SQL:2011. But that's not implemented in Postgres, yet.
There's a page in the Postgres Wiki.
There is also an unofficial extension on PGXN. I have not tested it and can't say how far along it is.

Postgres 9.5 introduces a feature to record commit timestamps (like @Craig commented). It needs to be enabled manually before it starts recording. The manual:

track_commit_timestamp (bool)

Record commit time of transactions. This parameter can only be set in postgresql.conf file or on the server command line. The default value is off.

And some functions to work with it.
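
For illustration, here is a sketch of how one of those functions, pg_xact_commit_timestamp(), might be combined with xmin. It assumes Postgres 9.5 or later with track_commit_timestamp = on, so it does not apply to the 9.3 server in the question, and it only covers transactions committed after the setting was enabled:

```sql
-- Assumes track_commit_timestamp = on in postgresql.conf (Postgres 9.5+).
-- Lists rows of the example table "big" whose current row version was
-- committed within the last 24 hours.
SELECT pg_xact_commit_timestamp(xmin) AS committed_at, *
FROM   big
WHERE  pg_xact_commit_timestamp(xmin) > now() - interval '24 hours'
ORDER  BY committed_at DESC;
```

This has to be run per table, and like the other xmin-based approaches it cannot show rows that were deleted.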

Erwin Brandstetter answered Sep 22 '22 17:09