Consider a database that maintains a list of persons and their contact information, including addresses and such.
Sometimes, the contact information changes. Instead of simply updating the single person record to the new values, I'd like to keep a history of the changes.
I'd like to keep the history in such a way that when I look at a person's record, I can quickly tell that older versions of that person's data exist as well. However, I'd also like to avoid building very complicated SQL queries just to retrieve the latest version of each person's record (this may be easy with a single table, but it quickly gets difficult once the table is joined to other tables).
I've come up with a few ways, which I'll add below as answers, but I wonder if there are better ones. (While I'm a seasoned programmer, I'm rather new to DB design, so I lack the experience and have already run into a few dead ends.)
Which DB? I am currently using SQLite but plan to move to a server-based DB engine eventually, probably Postgres. However, I mean this question in a general sense, not specific to any particular engine, though suggestions on how to solve this in specific engines are appreciated too, in the general interest.
SQL Server 2016 introduced a new feature, Temporal Tables, which allow you to keep a historical record of all of the versions of each row in a table. As rows get introduced, changed, and deleted over time, you can always see what the table looked like during a certain time period or at a specific point in time.
A schema change is an alteration made to a collection of logical structures (or schema objects) in a database. Schema changes are generally made using structured query language (SQL) and are typically implemented during maintenance windows.
To see schema changes on the server, right-click the connection element in Object Explorer and choose Reports > Standard Reports > Schema Changes History from the context menu.
This is generally referred to as a Slowly Changing Dimension, and the linked Wikipedia page offers several approaches to making it work.
Martin Fowler has a list of Temporal Patterns that are not exactly DB-specific, but offer a good starting point.
And finally, Microsoft SQL Server offers Change Data Capture and Change Tracking.
Quite often, the history of changes does not have to be structured, because it is needed for auditing purposes only and there is no actual need to run queries against the historical data. In that case it usually suffices to log each modification made to the database. All you need is a log table with a date-time field and a variable-length text field into which you format human-readable messages saying who changed what, what the old value was, and what the new value is. Nothing needs to be added to the actual data tables, and no additional complexity needs to be added to the queries.
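As a rough illustration of the log-table idea, here is a minimal sketch using Python's built-in `sqlite3` module. The table names (`person`, `change_log`), the column names, and the `update_address` helper are all made up for this example; the point is only that the data table stays unchanged and each modification writes one human-readable line.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        address TEXT
    );
    -- Hypothetical audit table: a timestamp plus free-form text.
    CREATE TABLE change_log (
        logged_at TEXT NOT NULL,
        message   TEXT NOT NULL
    );
""")


def update_address(conn, user, person_id, new_address):
    """Update one address and log the change in the same transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT address FROM person WHERE id = ?", (person_id,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no person with id {person_id}")
        old_address = row[0]
        conn.execute(
            "UPDATE person SET address = ? WHERE id = ?",
            (new_address, person_id),
        )
        conn.execute(
            "INSERT INTO change_log (logged_at, message) VALUES (?, ?)",
            (
                datetime.now(timezone.utc).isoformat(),
                f"{user} changed address of person {person_id} "
                f"from {old_address!r} to {new_address!r}",
            ),
        )


conn.execute("INSERT INTO person (id, name, address) VALUES (1, 'Alice', '1 Old Road')")
update_address(conn, "bob", 1, "2 New Street")

current_address = conn.execute("SELECT address FROM person WHERE id = 1").fetchone()[0]
log_messages = [m for (m,) in conn.execute("SELECT message FROM change_log")]
```

Doing the update and the log insert in one transaction keeps the log from drifting out of sync with the data. The same effect could probably be achieved with database triggers instead of application code, if you prefer the logging to happen no matter which client writes to the table.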
If you must keep historical information in the database so as to be able to execute queries against it, then I would recommend using views. Rename each table from "NAME" to "NAME_HISTORY" and then create a view called "NAME" which presents to you only the latest records. It is okay if code which modifies the table is burdened by having to refer to the table as "NAME_HISTORY" instead of "NAME", because that code will presumably also have to take account of the fact that it is not updating the table, it is appending new historical records to it. As a matter of fact, the use of views will prevent you from accidentally modifying a table without taking care of historicity, and that's a good thing.
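Here is a small sketch of the view approach, again with Python's `sqlite3` module. The `person_history` layout, the `version_id` column used to decide which row is the latest, and the `save_person` helper are assumptions made for this example, not the only way to do it (a timestamp or a "valid to" column would work as well).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Every change appends a new row; nothing is updated in place.
    CREATE TABLE person_history (
        version_id INTEGER PRIMARY KEY AUTOINCREMENT,
        person_id  INTEGER NOT NULL,
        name       TEXT NOT NULL,
        address    TEXT
    );
    -- The view carries the "plain" name and exposes only the newest
    -- version of each person.
    CREATE VIEW person AS
        SELECT h.person_id, h.name, h.address
        FROM person_history AS h
        WHERE h.version_id = (
            SELECT MAX(h2.version_id)
            FROM person_history AS h2
            WHERE h2.person_id = h.person_id
        );
""")


def save_person(conn, person_id, name, address):
    """Record a new version of a person by appending a history row."""
    with conn:
        conn.execute(
            "INSERT INTO person_history (person_id, name, address) VALUES (?, ?, ?)",
            (person_id, name, address),
        )


save_person(conn, 1, "Alice", "1 Old Road")
save_person(conn, 1, "Alice", "2 New Street")
save_person(conn, 2, "Bob", "5 High Street")

# Reading code queries the view and never sees old versions.
latest = conn.execute(
    "SELECT person_id, name, address FROM person ORDER BY person_id"
).fetchall()

# Older versions are still available in the history table.
versions_of_alice = conn.execute(
    "SELECT COUNT(*) FROM person_history WHERE person_id = 1"
).fetchone()[0]
```

Queries that join `person` to other tables stay as simple as they were before the history was introduced. In SQLite, views are read-only unless you define `INSTEAD OF` triggers, so an accidental `UPDATE person ...` fails rather than silently bypassing the history; other engines may treat views differently, so check this before relying on it.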