Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Database Design with Change History

I am looking to design a database that keeps track of every set of changes so that I can refer back to them in the future. So for example:

Database A   +==========+========+==========+ |   ID     |  Name  | Property |       1        Kyle      30 

If I change the row's 'property' field to 50, it should update the row to:

1    Kyle    50 

But should save the fact that the row's property was 30 at some point in time. Then if the row is again updated to be 70:

1    Kyle    70 

Both facts that the row's property was 50 and 70 should be preserved, such that with some query I could retrieve:

1    Kyle    30 1    Kyle    50 

It should recognize that these were the "same entries" just at different points in time.

Edit: This history will need to be presented to the user at some point in time so ideally, there should be an understanding of which rows belong to the same "revision cluster"

What is the best way to approach the design of this database?

like image 386
Delos Chang Avatar asked Jun 12 '13 21:06

Delos Chang


People also ask

How do you keep track of history in a database?

One simple way to keep version history is to create basically an identical table (eg. with _version suffix). Both of the tables would have a version field, which for the main table you increment for every update you do. The version table would have a composite primary key on (id, version).

How do you keep track of database changes?

At the basic database level you can track changes by having a separate table that gets an entry added to it via triggers on INSERT/UPDATE/DELETE statements. Thats the general way of tracking changes to a database table. The other thing you want is to know which user made the change.

What is historical data in database?

Historical data, in a broad context, is collected data about past events and circumstances pertaining to a particular subject. By definition, historical data includes most data generated either manually or automatically within an enterprise.


1 Answers

One way is to have a MyTableNameHistory for every table in your database, and make its schema identical to the schema of table MyTableName, except that the Primary Key of the History table has one additional column named effectiveUtc as a DateTime. For example, if you have a table named Employee,

Create Table Employee {   employeeId integer Primary Key Not Null,   firstName varChar(20) null,   lastName varChar(30) Not null,   HireDate smallDateTime null,   DepartmentId integer null } 

Then the History table would be

Create Table EmployeeHistory {   employeeId integer Not Null,   effectiveUtc DateTime Not Null,   firstName varChar(20) null,   lastName varChar(30) Not null,   HireDate smallDateTime null,   DepartmentId integer null,   Primary Key (employeeId , effectiveUtc) } 

Then, you can put a trigger on Employee table, so that every time you insert, update, or delete anything in the Employee table, a new record is inserted into the EmployeeHistory table with the exact same values for all the regular fields, and current UTC datetime in the effectiveUtc column.

Then to find the values at any point in the past, you just select the record from the history table whose effectiveUtc value is the highest value prior to the asOf datetime you want the value as of.

 Select * from EmployeeHistory h  Where EmployeeId = @EmployeeId    And effectiveUtc =     (Select Max(effectiveUtc)      From EmployeeHistory       Where EmployeeId = h.EmployeeId         And effcetiveUtc < @AsOfUtcDate)  
like image 87
Charles Bretana Avatar answered Oct 04 '22 18:10

Charles Bretana