I have 2 tables, Table-A and Table-A-History.
Table-A contains current data rows. Table-A-History contains historical data.
I would like to have the most current row of my data in Table-A, and Table-A-History containing historical rows.
I can think of 2 ways to accomplish this:
1. Whenever a new data row is available, move the current row from Table-A to Table-A-History and update the Table-A row with the latest data (via insert into select or select into table).
2. Whenever a new data row is available, update Table-A's row and insert a new row into Table-A-History.
With regard to performance, is method 1 or 2 better? Is there a better, different way to accomplish this?
Basically, you are looking to track/audit changes to a table while keeping the primary table small in size.
There are several ways to solve this issue. The pros and cons of each are discussed below.
1 - Auditing of the table with triggers.
If you are looking to audit the table (inserts, updates, deletes), look at my "How to prevent unwanted transactions" SQL Saturday slide deck with code: http://craftydba.com/?page_id=880. The trigger that fills the audit table can hold information from multiple tables, if you choose, since the data is saved as XML. Therefore, you can undo a delete if necessary by parsing the XML. The audit table tracks who and what made the change.
Optionally, you can have the audit table on its own filegroup.
Description:
  Table triggers for (insert, update, delete).
  Active table has current records.
  Audit (history) table for non-active records.
Pros:
  Active table has a smaller number of records.
  Index in the active table is small.
  Change is quickly reported in the audit table.
  Tells you what change was made (ins, del, upd).
Cons:
  Have to join two tables to do historical reporting.
  Does not track schema changes.
2 - Effective dating the records
If you are never going to purge the data from the audit table, why not mark the row as deleted but keep it forever? Many systems, such as PeopleSoft, use effective dating to show whether a record is still active. In the BI world this is called a type 2 dimension table (slowly changing dimensions). See the data warehouse institute article: http://www.bidw.org/datawarehousing/scd-type-2/ Each record has a begin date and an end date.
All active records have an end date of NULL.
Description:
  Table triggers for (insert, update, delete).
  Main table has both active and historical records.
Pros:
  Historical reporting is easy.
  Change is quickly shown in the main table.
Cons:
  Main table has a large number of records.
  Index of the main table is large.
  Both active and history records are in the same filegroup.
  Does not tell you what change was made (ins, del, upd).
  Does not track schema changes.
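A minimal sketch of effective dating, to make the begin/end date idea concrete. The table dbo.TableA_Dated and its columns are hypothetical, and the version change is written as plain statements in a transaction rather than inside a trigger, purely to keep the example short.

```sql
-- Hypothetical table: names and columns are illustrative only.
CREATE TABLE dbo.TableA_Dated
(
    RowId      int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    BusinessId int           NOT NULL,  -- natural key of the tracked entity
    Payload    nvarchar(100) NOT NULL,  -- stand-in for the real data columns
    BeginDate  datetime      NOT NULL DEFAULT (GETDATE()),
    EndDate    datetime      NULL       -- NULL marks the active row
);
GO

-- Recording a new version: close the active row, then add its replacement.
DECLARE @BusinessId int           = 1,
        @Payload    nvarchar(100) = N'new value',
        @Now        datetime      = GETDATE();

BEGIN TRANSACTION;

UPDATE dbo.TableA_Dated
SET    EndDate = @Now
WHERE  BusinessId = @BusinessId
  AND  EndDate IS NULL;

INSERT INTO dbo.TableA_Dated (BusinessId, Payload, BeginDate)
VALUES (@BusinessId, @Payload, @Now);

COMMIT TRANSACTION;
GO

-- Active rows only
SELECT BusinessId, Payload, BeginDate
FROM   dbo.TableA_Dated
WHERE  EndDate IS NULL;
```

Historical reporting then becomes a single-table query that filters on BeginDate and EndDate, which is the main attraction of this approach.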
3 - Change Data Capture (Enterprise Feature).
Microsoft SQL Server 2008 introduced the change data capture (CDC) feature. While CDC tracks data changes using a log reader after the fact, it lacks things like who and what made the change. MSDN details - http://technet.microsoft.com/en-us/library/bb522489(v=sql.105).aspx
This solution is dependent upon the CDC jobs running. Any issues with SQL Server Agent will cause delays in data showing up.
See change data capture tables. http://technet.microsoft.com/en-us/library/bb500353(v=sql.105).aspx
Description:
  Enable change data capture.
Pros:
  Do not need to add triggers or tables to capture data.
  Tells you what change was made (ins, del, upd) via the __$operation field in <user_defined_table_CT>.
  Tracks schema changes.
Cons:
  Only available in Enterprise edition.
  Since it reads the log after the fact, there is a time delay in data showing up.
  The CDC tables do not track who or what made the change.
  Disabling CDC removes the tables (not nice)!
  Need to decode and use the __$update_mask to figure out which columns changed.
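As a rough illustration of what enabling CDC looks like, the sketch below turns it on for one table and decodes the __$operation column. The database name MyDb and the table dbo.TableA are assumptions made for the example; the change table name follows from the default capture instance name of schema_table.

```sql
-- MyDb and dbo.TableA are hypothetical names used for illustration.
USE MyDb;
GO

-- Enable CDC at the database level (requires sysadmin).
EXEC sys.sp_cdc_enable_db;
GO

-- Enable CDC for one table; NULL role means access is not gated by a role.
EXEC sys.sp_cdc_enable_table
     @source_schema = N'dbo',
     @source_name   = N'TableA',
     @role_name     = NULL;
GO

-- Read the change table for the default capture instance (dbo_TableA).
SELECT __$start_lsn,
       __$seqval,
       CASE __$operation
            WHEN 1 THEN 'delete'
            WHEN 2 THEN 'insert'
            WHEN 3 THEN 'update (before image)'
            WHEN 4 THEN 'update (after image)'
       END AS operation,
       __$update_mask
FROM   cdc.dbo_TableA_CT
ORDER BY __$start_lsn, __$seqval;
```

Rows only appear in the change table after the capture job has read the log, so an empty result right after a change is expected if SQL Server Agent is stopped or lagging.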
4 - Change Tracking Feature (All Versions).
Microsoft SQL Server 2008 introduced the change tracking feature. Unlike CDC, it is available in all editions; however, you have to call a set of T-SQL functions to figure out what happened.
It was designed for synchronizing a data source with SQL Server via an application. There is a whole synchronization framework on TechNet.
http://msdn.microsoft.com/en-us/library/bb933874.aspx http://msdn.microsoft.com/en-us/library/bb933994.aspx http://technet.microsoft.com/en-us/library/bb934145(v=sql.105).aspx
Unlike CDC, you specify how long changes are kept in the database before being purged. Also, inserts and deletes do not record the row data, and updates only record which columns changed.
Since you are synchronizing the SQL Server source to another target, this works fine. It is not good for auditing unless you write a periodic job to figure out changes.
You will still have to store that information somewhere.
Description:
  Enable change tracking.
Cons:
  Not a good auditing solution.
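For comparison, here is a sketch of enabling change tracking and asking what changed since a given version. MyDb, dbo.TableA, the primary key column Id and the 2-day retention are all assumptions for the example; change tracking does require the table to have a primary key.

```sql
-- MyDb, dbo.TableA and its primary key column Id are hypothetical.
ALTER DATABASE MyDb
SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);
GO

ALTER TABLE dbo.TableA
ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON);
GO

-- A sync client would store the version it last processed; 0 means "from the start".
DECLARE @last_sync_version bigint = 0;

-- Returns the key and the operation (I, U or D), but not the old column values.
SELECT CT.Id,
       CT.SYS_CHANGE_OPERATION,
       CT.SYS_CHANGE_VERSION
FROM   CHANGETABLE(CHANGES dbo.TableA, @last_sync_version) AS CT;

-- Version to remember for the next sync pass.
SELECT CHANGE_TRACKING_CURRENT_VERSION() AS current_version;
```

The result shows which keys changed and how, which is enough for synchronization but leaves nothing to audit unless a job copies the current row values somewhere else.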
The first three solutions will work for your auditing. I like the first solution since I use it extensively in my environment.
Sincerely
John
Code Snippet From Presentation (Autos Database)
--
-- 7 - Auditing data changes (table for DML trigger)
--

-- Delete existing table
IF OBJECT_ID('[AUDIT].[LOG_TABLE_CHANGES]') IS NOT NULL
    DROP TABLE [AUDIT].[LOG_TABLE_CHANGES]
GO

-- Add the table
CREATE TABLE [AUDIT].[LOG_TABLE_CHANGES]
(
    [CHG_ID] [numeric](18, 0) IDENTITY(1,1) NOT NULL,
    [CHG_DATE] [datetime] NOT NULL,
    [CHG_TYPE] [varchar](20) NOT NULL,
    [CHG_BY] [nvarchar](256) NOT NULL,
    [APP_NAME] [nvarchar](128) NOT NULL,
    [HOST_NAME] [nvarchar](128) NOT NULL,
    [SCHEMA_NAME] [sysname] NOT NULL,
    [OBJECT_NAME] [sysname] NOT NULL,
    [XML_RECSET] [xml] NULL,
    CONSTRAINT [PK_LTC_CHG_ID] PRIMARY KEY CLUSTERED ([CHG_ID] ASC)
) ON [PRIMARY]
GO

-- Add defaults for key information
ALTER TABLE [AUDIT].[LOG_TABLE_CHANGES]
    ADD CONSTRAINT [DF_LTC_CHG_DATE] DEFAULT (getdate()) FOR [CHG_DATE];
ALTER TABLE [AUDIT].[LOG_TABLE_CHANGES]
    ADD CONSTRAINT [DF_LTC_CHG_TYPE] DEFAULT ('') FOR [CHG_TYPE];
ALTER TABLE [AUDIT].[LOG_TABLE_CHANGES]
    ADD CONSTRAINT [DF_LTC_CHG_BY] DEFAULT (coalesce(suser_sname(),'?')) FOR [CHG_BY];
ALTER TABLE [AUDIT].[LOG_TABLE_CHANGES]
    ADD CONSTRAINT [DF_LTC_APP_NAME] DEFAULT (coalesce(app_name(),'?')) FOR [APP_NAME];
ALTER TABLE [AUDIT].[LOG_TABLE_CHANGES]
    ADD CONSTRAINT [DF_LTC_HOST_NAME] DEFAULT (coalesce(host_name(),'?')) FOR [HOST_NAME];
GO

--
-- 8 - Make DML trigger to capture changes
--

-- Delete existing trigger
IF OBJECT_ID('[ACTIVE].[TRG_FLUID_DATA]') IS NOT NULL
    DROP TRIGGER [ACTIVE].[TRG_FLUID_DATA]
GO

-- Add trigger to log all changes
CREATE TRIGGER [ACTIVE].[TRG_FLUID_DATA] ON [ACTIVE].[CARS_BY_COUNTRY]
FOR INSERT, UPDATE, DELETE AS
BEGIN

    -- Detect inserts (rows in inserted only)
    IF EXISTS (select * from inserted) AND NOT EXISTS (select * from deleted)
    BEGIN
        INSERT [AUDIT].[LOG_TABLE_CHANGES] ([CHG_TYPE], [SCHEMA_NAME], [OBJECT_NAME], [XML_RECSET])
        SELECT 'INSERT', '[ACTIVE]', '[CARS_BY_COUNTRY]',
            (SELECT * FROM inserted as Record for xml auto, elements, root('RecordSet'), type)
        RETURN;
    END

    -- Detect deletes (rows in deleted only)
    IF EXISTS (select * from deleted) AND NOT EXISTS (select * from inserted)
    BEGIN
        INSERT [AUDIT].[LOG_TABLE_CHANGES] ([CHG_TYPE], [SCHEMA_NAME], [OBJECT_NAME], [XML_RECSET])
        SELECT 'DELETE', '[ACTIVE]', '[CARS_BY_COUNTRY]',
            (SELECT * FROM deleted as Record for xml auto, elements, root('RecordSet'), type)
        RETURN;
    END

    -- Detect updates (rows in both); the old values from deleted are logged
    IF EXISTS (select * from inserted) AND EXISTS (select * from deleted)
    BEGIN
        INSERT [AUDIT].[LOG_TABLE_CHANGES] ([CHG_TYPE], [SCHEMA_NAME], [OBJECT_NAME], [XML_RECSET])
        SELECT 'UPDATE', '[ACTIVE]', '[CARS_BY_COUNTRY]',
            (SELECT * FROM deleted as Record for xml auto, elements, root('RecordSet'), type)
        RETURN;
    END

END;
GO

--
-- 9 - Test DML trigger by updating, deleting and inserting data
--

-- Execute an update
UPDATE [ACTIVE].[CARS_BY_COUNTRY]
SET COUNTRY_NAME = 'Czech Republic'
WHERE COUNTRY_ID = 8
GO

-- Remove all data
DELETE FROM [ACTIVE].[CARS_BY_COUNTRY];
GO

-- Execute the load
EXECUTE [ACTIVE].[USP_LOAD_CARS_BY_COUNTRY];
GO

-- Show the audit trail
SELECT * FROM [AUDIT].[LOG_TABLE_CHANGES]
GO

-- Disable the trigger
ALTER TABLE [ACTIVE].[CARS_BY_COUNTRY] DISABLE TRIGGER [TRG_FLUID_DATA];
** Look & Feel of audit table **
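The idea of recovering rows by parsing the XML can be sketched as a query that shreds XML_RECSET back into columns. This is an illustration only: COUNTRY_ID and COUNTRY_NAME are the two column names that appear in the test script above, and their data types (int and nvarchar(128)) are assumptions.

```sql
-- Shred logged DELETE records back into rows.
-- Column data types are assumed; adjust them to the real table definition.
SELECT L.[CHG_ID],
       L.[CHG_DATE],
       L.[CHG_BY],
       R.Rec.value('(COUNTRY_ID/text())[1]',   'int')           AS COUNTRY_ID,
       R.Rec.value('(COUNTRY_NAME/text())[1]', 'nvarchar(128)') AS COUNTRY_NAME
FROM   [AUDIT].[LOG_TABLE_CHANGES] AS L
CROSS APPLY L.[XML_RECSET].nodes('/RecordSet/Record') AS R(Rec)
WHERE  L.[CHG_TYPE] = 'DELETE'
  AND  L.[OBJECT_NAME] = '[CARS_BY_COUNTRY]'
ORDER BY L.[CHG_ID];
```

Because the trigger uses ELEMENTS without XSINIL, columns that were NULL are simply absent from the XML, so value() returns NULL for them. Feeding this result into an INSERT would restore the deleted rows.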
Logging changes is something I've generally done using triggers on a base table to record changes in a log table. The log table has additional columns to record the database user, action and date/time.
-- Names containing hyphens must be bracket-quoted in T-SQL.

-- Log deleted rows
create trigger [Table-A_LogDelete] on dbo.[Table-A]
for delete
as
    declare @Now as DateTime = GetDate()
    set nocount on
    insert into [Table-A-History]
        select SUser_SName(), 'delete-deleted', @Now, * from deleted
go
exec sp_settriggerorder @triggername = '[Table-A_LogDelete]', @order = 'last', @stmttype = 'delete'
go

-- Log inserted rows
create trigger [Table-A_LogInsert] on dbo.[Table-A]
for insert
as
    declare @Now as DateTime = GetDate()
    set nocount on
    insert into [Table-A-History]
        select SUser_SName(), 'insert-inserted', @Now, * from inserted
go
exec sp_settriggerorder @triggername = '[Table-A_LogInsert]', @order = 'last', @stmttype = 'insert'
go

-- Log both the old and the new values of updated rows
create trigger [Table-A_LogUpdate] on dbo.[Table-A]
for update
as
    declare @Now as DateTime = GetDate()
    set nocount on
    insert into [Table-A-History]
        select SUser_SName(), 'update-deleted', @Now, * from deleted
    insert into [Table-A-History]
        select SUser_SName(), 'update-inserted', @Now, * from inserted
go
exec sp_settriggerorder @triggername = '[Table-A_LogUpdate]', @order = 'last', @stmttype = 'update'
Logging triggers should always be set to fire last. Otherwise, a subsequent trigger may roll back the original transaction, but the log table will have already been updated. This is a confusing state of affairs.