Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I reconstruct a historical view?

I am currently exploring Change Data Capture as an option to store temporal databases. It is great because it stores only the deltas and seems like it may solve my problem. When I enabled CDC, a bunch of tables appeared under System Tables.

When querying cdc.dbo_MyTable, I am able to see all the changes that took place on the table. Now, how would I construct a historical view? For instance, if I wanted to see the state of my table as of a particular date, how would I go about doing that? Is that even possible?

It looks I need to take the log and start applying it over my original table but I was wondering if there is a built-in way of doing this. Any suggestions?

Some of the use cases I am looking at:

  • Know the state of the graph at a particular point in time
  • Given two graphs at different times, know the set of links that are different (this can probably be obtained using an EXCEPT clause after constructing the tables)
like image 389
Legend Avatar asked Aug 13 '12 00:08

Legend


1 Answers

it's possible, but not with a built-in way i'm a afraid. You would have to reconstruct the timeline by hand.

Given that the change-tracking tables offer the tran_end_time, which is the time that the value of the property should be perceived as persisted, you would have to make a query that fetches all the distinct periods of table states, join on the tracked property changes and then pivot (to have a presentation in the same form as the table). Don't forget to union with the table state itself to obtain the values that have not been changed/tracked for completeness.

The final result, simplified, should look like

RN PK PropA   PropB   FromDate          ToDate
1  1  'Ver1'  'Ver1'  2012-01-01 09:00  2012-01-02 08:00
2  1  'Ver1'  'Ver2'  2012-01-02 08:00  2012-01-03 07:00
3  1  'Ver2'  'Ver2'  2012-01-03 07:00  *getdate()*
4  2  'Ver1'  'Ver1'  2012-01-01 05:00  2012-01-02 06:00
5  2  'Ver1'  'Ver2'  2012-01-02 06:00  2012-01-03 01:00
6  2  'Ver2'  'Ver2'  2012-01-03 01:00  *getdate()*

note that the getdate() is valid if the row wasn't deleted in which case it should be substituted with the deletion date

EDIT, for the 2 use cases. The first point is easily addressed it's a matter of constructing the temporal object graph and then filtering:

declare @pointInTime datetime = '20120102 10:00';
select * from Reconstructed_TG where FromDate <= @pointInTime and @pointInTime < ToDate

the second point, can be generated easily with the EXCEPT clause, as you point out. given the above query:

declare @pointInTimeA datetime = '20120102 10:00';
declare @pointInTimeB datetime = '20120103 01:00';
select * from Reconstructed_TG where FromDate <= @pointInTimeA and @pointInTimeA < ToDate
EXCEPT
select * from Reconstructed_TG where FromDate <= @pointInTimeB and @pointInTimeB < ToDate

yet the except clause only presents the rows that have at least one different column value; i don't know if that information is really meaningful to the human eye. Depending on your needs a query that works directly on the cdc data may be more appropriate.

like image 103
Jaguar Avatar answered Nov 11 '22 18:11

Jaguar