
Subscriber-only trigger

Tags: sql-server, tsql, triggers, replication, database-trigger

There are two servers. The first is the production ERP system. The second is the BI server for heavy analytical queries. We update the BI server daily from backups. However, that is not enough: some users want to see their data changes sooner than the next day. I have no access to the ERP server and can't do anything there except ask for backups or replication.

Before asking for replication, I want to understand whether it's possible to use subscriber-side triggers so that I process only the changed data rather than all of it. There is an ETL process that makes some queries faster (indexing, transformation, etc.). Triggers should do the trick, but I can't find a way to use them on the subscriber side only. The ERP system doesn't allow any changes at the DB level, so the subscriber database seems like the right place for triggers (they don't affect ERP server performance). Nonetheless, I can't find a way to set them up. Processing all the data is an insane overhead.

Use case: as a simplified example, say we have two replicated tables:

+------------+-------------+--------+
|     dt     | customer_id | amount |
+------------+-------------+--------+
| 2017-01-01 |           1 |    234 |
| 2017-01-02 |           1 |    123 |
+------------+-------------+--------+

+------------+-------------+------------+------------+
| manager_id | customer_id | date_from  |  date_to   |
+------------+-------------+------------+------------+
|          1 |           1 | 2017-01-01 | 2017-01-02 |
|          2 |           1 | 2017-01-02 |       null |
+------------+-------------+------------+------------+

I need to transform them into the following indexed table:

+----------+-------------+------------+--------+
|  dt_id   | customer_id | manager_id | amount |
+----------+-------------+------------+--------+
| 20170101 |           1 |          1 |    234 |
| 20170102 |           1 |          2 |    123 |
+----------+-------------+------------+--------+

So, I created yet another database where I store the table above. Currently, to update the table I have to truncate it and reinsert all the data. I could join the source tables to check for diffs, but that's too heavy for big tables as well. A trigger would let me track only the changed records. The first input table can use a trigger:

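create trigger SaleInsert
on Table1
after insert
as
begin
    set nocount on;
    -- "inserted" holds only the rows added by the statement that fired the trigger
    insert into NewDB..IndexedTable (dt_id, customer_id, manager_id, amount)
    select
        cast(convert(char(8), i.dt, 112) as int),  -- e.g. 2017-01-01 -> 20170101
        i.customer_id,
        m.manager_id,
        i.amount
    from inserted as i
    left join Table2 as m
        on i.customer_id = m.customer_id
        and i.dt >= m.date_from
        and (i.dt < m.date_to or m.date_to is null)  -- null date_to = still assigned
end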

The same idea applies to update and delete, with a similar approach for the second table. That way I could get an automatically updated DWH with only a small lag. Yes, I expect some performance lag on heavily loaded databases. Theoretically, it should work smoothly on servers with the same configuration.

But, again, I can't find a way to have triggers on the subscriber side only. Any ideas or alternatives?
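As a rough sketch of what "the same idea" could look like for update and delete, the trigger below removes the target rows that correspond to the old versions and reinserts the new ones. It assumes IndexedTable holds one row per (dt_id, customer_id); the trigger name SaleUpdateDelete is made up for illustration.

```sql
create trigger SaleUpdateDelete
on Table1
after update, delete
as
begin
    set nocount on;

    -- Remove target rows for the old versions ("deleted" is filled for both update and delete)
    delete t
    from NewDB..IndexedTable as t
    join deleted as d
        on t.dt_id = cast(convert(char(8), d.dt, 112) as int)
        and t.customer_id = d.customer_id;

    -- Reinsert the new versions ("inserted" is empty for a delete, so nothing is added then)
    insert into NewDB..IndexedTable (dt_id, customer_id, manager_id, amount)
    select
        cast(convert(char(8), i.dt, 112) as int),
        i.customer_id,
        m.manager_id,
        i.amount
    from inserted as i
    left join Table2 as m
        on i.customer_id = m.customer_id
        and i.dt >= m.date_from
        and (i.dt < m.date_to or m.date_to is null);
end
```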


asked Oct 10 '17 07:10

Danylo Korostil

2 Answers

MS SQL Server has a "Change Tracking" feature that may be of use to you. You enable the database for change tracking and configure which tables you wish to track. SQL Server then creates change records on every update, insert, and delete on a tracked table and lets you query for changes made since the last time you checked. This is very useful for syncing changes and is more efficient than using triggers. It's also easier to manage than building your own tracking tables. This has been a feature since SQL Server 2008.

How to: Use SQL Server Change Tracking
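A minimal sketch of enabling it, assuming it is turned on in the subscriber database so the ERP server is untouched. SubscriberDB and dbo.Table1 are placeholder names, and the retention period is an arbitrary choice. The tracked table needs a primary key.

```sql
-- Placeholder names: SubscriberDB, dbo.Table1
ALTER DATABASE SubscriberDB
SET CHANGE_TRACKING = ON
    (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

-- Must be repeated for every table you want to track
ALTER TABLE dbo.Table1
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = ON);  -- also record which columns were updated
```

The retention period should comfortably exceed the longest gap you expect between two sync runs; otherwise the change history is cleaned up before you read it.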

Change tracking only captures the primary keys of the changed rows and lets you query which columns might have been modified. You then join the tables on those keys to get the current data. If you want it to capture the data as well, you can use Change Data Capture, but it has more overhead and originally required at least SQL Server 2008 Enterprise edition (Standard edition supports it from SQL Server 2016 SP1).

Change Data Capture
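For comparison, a hedged sketch of the Change Data Capture route. The database and table names are placeholders, and it assumes SQL Server Agent is running, since the capture job depends on it. With the default capture instance name (schema_table), the generated query function is cdc.fn_cdc_get_all_changes_dbo_Table1.

```sql
USE SubscriberDB;  -- placeholder database name
GO
EXEC sys.sp_cdc_enable_db;
GO
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Table1',
    @role_name     = NULL;   -- no gating role; restrict this in a real setup
GO

-- Read everything captured so far for the default capture instance dbo_Table1
DECLARE @from_lsn binary(10) = sys.fn_cdc_get_min_lsn('dbo_Table1');
DECLARE @to_lsn   binary(10) = sys.fn_cdc_get_max_lsn();

SELECT __$operation,          -- 1 = delete, 2 = insert, 4 = update (new values)
       dt, customer_id, amount
FROM cdc.fn_cdc_get_all_changes_dbo_Table1(@from_lsn, @to_lsn, N'all');
```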

The general process is:

1. Get the current sync version.
2. Get the last sync version you used to get changes.
3. Get all the primary keys of the rows that have changed since that last version (inserts, updates, and deletes).
4. Join the keys with the tables and pull down the data (if not using Change Data Capture).
5. Save the data and the current sync version (from the first step).

Then you repeat this process whenever you want to pick up the next set of changes. SQL Server does all the work of storing the changes and the versioning behind the scenes. You might also want to look into Snapshot Isolation, which works well with change tracking. The linked article has more information about that.
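The five steps above can be sketched in T-SQL roughly as follows. This is an illustration under assumptions, not a finished ETL job: NewDB.dbo.SyncState is a hypothetical bookkeeping table you would create yourself, and (dt, customer_id) is assumed to be the primary key of dbo.Table1.

```sql
-- Step 1: current sync version
DECLARE @current bigint = CHANGE_TRACKING_CURRENT_VERSION();

-- Step 2: last version we synced to (hypothetical bookkeeping table)
DECLARE @last_sync bigint =
    (SELECT last_version FROM NewDB.dbo.SyncState WHERE table_name = N'Table1');

-- Guard: if cleanup has already removed the history we need, a full reload is required
IF @last_sync IS NULL
   OR @last_sync < CHANGE_TRACKING_MIN_VALID_VERSION(OBJECT_ID(N'dbo.Table1'))
BEGIN
    RAISERROR(N'Change history no longer covers the last sync; do a full reload.', 16, 1);
    RETURN;
END;

-- Steps 3 and 4: changed primary keys, joined back to the current data
SELECT ct.SYS_CHANGE_OPERATION,      -- 'I', 'U' or 'D'
       ct.dt,
       ct.customer_id,
       t.amount                      -- NULL for deleted rows
FROM CHANGETABLE(CHANGES dbo.Table1, @last_sync) AS ct
LEFT JOIN dbo.Table1 AS t
    ON t.dt = ct.dt
    AND t.customer_id = ct.customer_id;

-- Step 5: remember the version captured in step 1 for the next run
UPDATE NewDB.dbo.SyncState
SET last_version = @current
WHERE table_name = N'Table1';
```

Capturing the current version before reading the changes means a row modified during the run is picked up again next time instead of being missed.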


answered Nov 04 '22 00:11

Brian Pressler

This answer is somewhat roundabout, but given your tight limitations perhaps you'll consider it.

First, go for replication, as you seem to have decided. You mentioned creating yet another database but were stuck on how to create triggers to populate it. The answer lies in the ability to run post-snapshot scripts. When creating the replication publication, the DBA can specify a script to run on the Subscriber after the snapshot is applied.

[Screenshot: Post-snapshot script, https://i.stack.imgur.com/qGTYA.jpg]

You can have the script create all the triggers you require.
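As a sketch of what this could look like: the first part is the content of a script file, the second is what the DBA might run to attach it to an existing publication. The file path, the publication name ErpToBi, and the trigger body are all assumptions for illustration.

```sql
-- Contents of a hypothetical file create_bi_triggers.sql,
-- executed on the Subscriber after the snapshot is applied
IF OBJECT_ID(N'dbo.SaleInsert', N'TR') IS NOT NULL
    DROP TRIGGER dbo.SaleInsert;
GO
CREATE TRIGGER dbo.SaleInsert
ON dbo.Table1
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO NewDB.dbo.IndexedTable (dt_id, customer_id, amount)
    SELECT CAST(CONVERT(char(8), i.dt, 112) AS int), i.customer_id, i.amount
    FROM inserted AS i;
END;
GO

-- Run by the DBA at the Publisher, in the publication database.
-- Changing this property invalidates the existing snapshot.
EXEC sp_changepublication
    @publication = N'ErpToBi',                                   -- hypothetical name
    @property    = N'post_snapshot_script',
    @value       = N'\\fileserver\repl\create_bi_triggers.sql',  -- hypothetical path
    @force_invalidate_snapshot = 1;
```

Do not mark these triggers NOT FOR REPLICATION: that option stops a trigger from firing when a replication agent applies the change, which is exactly the case you want to catch here.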

Also, to prevent replication from overwriting your triggers with "no trigger" (as defined in the ERP database), the DBA will need to verify that, for each table on which you have triggers, the property Copy user triggers is set to False.

[Screenshot: Copy user triggers, https://i.stack.imgur.com/9ZdjW.jpg]
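If the DBA prefers a query to clicking through each article's properties, something like the following may help. It is a sketch that assumes a transactional publication and relies on the 0x100 bit of the article schema option, which is the one that controls replication of user triggers.

```sql
-- Run at the Publisher, in the publication database
SELECT a.name AS article,
       CASE WHEN CAST(a.schema_option AS bigint) & 256 = 256   -- 256 = 0x100
            THEN 'True' ELSE 'False'
       END AS copy_user_triggers
FROM dbo.sysarticles AS a
ORDER BY a.name;
```

Every article that has a subscriber-side trigger should show False here.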

answered Nov 04 '22 00:11

Wayne Ivory
