SQL schema pattern for keeping history of changes

Tags: database-schema

Consider a database that maintains a list of persons and their contact information, including addresses and such.

Sometimes, the contact information changes. Instead of simply updating the single person record to the new values, I like to keep a history of the changes.

I'd like to keep the history in such a way that, when I look at a person's record, I can quickly tell whether older versions of that person's data exist. However, I'd also like to avoid building complicated SQL queries just to retrieve the latest version of each person's record. That may be easy with a single table, but it quickly gets difficult once the table is joined to other tables.

I've come up with a few approaches, which I'll add below as answers, but I wonder whether there are better ones. (While I'm a seasoned programmer, I'm rather new to DB design, so I lack experience and have already run into a few dead ends.)

Which DB? I am currently using SQLite but plan to move to a server-based DB engine eventually, probably Postgres. However, I mean this question in a general form, not specific to any particular engine, though suggestions on how to solve this in particular engines are appreciated too, in the general interest.

870

asked Oct 02 '15 11:10

Thomas Tempelmann

2 Answers

This is generally referred to as a Slowly Changing Dimension, and the Wikipedia page of that name offers several approaches to make this work.

Martin Fowler has a list of Temporal Patterns that are not exactly DB-specific, but offer a good starting point.

And finally, Microsoft SQL Server offers Change Data Capture and Change Tracking.

141

answered Sep 22 '22 04:09

Anton Gogolev

Quite often, the history of changes does not have to be structured, because it is needed for auditing purposes only and there is no actual need to run queries against the historical data. In that case it usually suffices to log each modification made to the database. All you need is a log table with a date-time field and a variable-length text field, into which you format human-readable messages saying who changed what, what the old value was, and what the new value is. Nothing needs to be added to the actual data tables, and no additional complexity needs to be added to the queries.
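To make the log-table idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (`person`, `change_log`), the columns, and the `update_address` helper are all made up for illustration; the answer itself does not prescribe any of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data table; it needs no history columns at all.
    CREATE TABLE person (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        address TEXT NOT NULL
    );
    -- Hypothetical log table: a timestamp plus a free-form message.
    CREATE TABLE change_log (
        changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        message    TEXT NOT NULL
    );
""")


def update_address(conn, person_id, new_address, changed_by):
    """Update a person's address and log the change in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT address FROM person WHERE id = ?", (person_id,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no person with id {person_id}")
        old_address = row[0]
        conn.execute(
            "UPDATE person SET address = ? WHERE id = ?",
            (new_address, person_id),
        )
        conn.execute(
            "INSERT INTO change_log (message) VALUES (?)",
            (
                f"{changed_by} changed address of person {person_id} "
                f"from {old_address!r} to {new_address!r}",
            ),
        )


with conn:
    conn.execute(
        "INSERT INTO person (id, name, address) VALUES (1, 'Alice', '1 Old Street')"
    )
update_address(conn, 1, "2 New Avenue", "admin")

current_address = conn.execute(
    "SELECT address FROM person WHERE id = 1"
).fetchone()[0]
log_messages = [m for (m,) in conn.execute("SELECT message FROM change_log")]
```

Doing the update and the log insert in the same transaction means the log cannot silently drift from the data. Queries against `person` stay exactly as they would be without any history.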

If you must keep historical information in the database so that you can execute queries against it, then I would recommend using views. Rename each table from "NAME" to "NAME_HISTORY" and then create a view called "NAME" which presents only the latest records. It is okay if code which modifies the table is burdened by having to refer to it as "NAME_HISTORY" instead of "NAME", because that code will presumably also have to take into account that it is not updating the table but appending new historical records to it. As a matter of fact, the use of views will prevent you from accidentally modifying a table without taking care of historicity, and that's a good thing.
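A rough sketch of the view approach, again with Python's sqlite3 module. The column layout, and in particular the auto-incrementing `version_id` used to decide which row is "latest", is an assumption for this example; a timestamp column would serve the same purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Every change appends a row here; nothing is ever updated in place.
    CREATE TABLE person_history (
        version_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- assumed ordering key
        person_id  INTEGER NOT NULL,
        name       TEXT NOT NULL,
        address    TEXT NOT NULL
    );
    -- The view carries the original table name and exposes only the
    -- newest version of each person.
    CREATE VIEW person AS
        SELECT h.person_id, h.name, h.address
        FROM person_history AS h
        WHERE h.version_id = (
            SELECT MAX(h2.version_id)
            FROM person_history AS h2
            WHERE h2.person_id = h.person_id
        );
""")

with conn:
    conn.executemany(
        "INSERT INTO person_history (person_id, name, address) VALUES (?, ?, ?)",
        [
            (1, "Alice", "1 Old Street"),
            (2, "Bob", "5 Main Road"),
            (1, "Alice", "2 New Avenue"),  # Alice moved: a new version row
        ],
    )

# Readers query the view as if it were the plain table.
latest = conn.execute(
    "SELECT person_id, name, address FROM person ORDER BY person_id"
).fetchall()

# The full history remains available when it is actually needed.
alice_versions = conn.execute(
    "SELECT COUNT(*) FROM person_history WHERE person_id = 1"
).fetchone()[0]
```

Code that joins against `person` does not need to know about versions at all. In SQLite a view is read-only unless you add INSTEAD OF triggers, which gives the protection against accidental in-place updates described above.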

answered Sep 23 '22 04:09

Mike Nakis
