Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Subscriber-only trigger

There are two servers. The first is ERP system on production. The second one is the BI server for heavy analytical queries. We update the BI server on a daily basis via backups. However, it's not enough, some users want to see their data changes more often than the next day. I have no access to the ERP server and can't do anything, except asking for backups or replications.

Before starting to ask for the replication. I want to understand if it's possible to use subscriber triggers in order to process not all the data, but changed. There is an ETL process to make some queries faster (indexing, transformation, etc). Triggers should do the trick, but I can't find a way to use them on the subscriber side only. The ERP system doesn't allow to make any changes on the DB level. So, the subscriber database seems to be fine for triggers (they don't affect on the ERP server performance). Nonetheless, I can't find a way to set them up. Processing all the data is an insane overhead.

Use case: Simplified example, say, we have two replicated tables:

+------------+-------------+--------+
|     dt     | customer_id | amount |
+------------+-------------+--------+
| 2017-01-01 |           1 |    234 |
| 2017-01-02 |           1 |    123 |
+------------+-------------+--------+

+------------+-------------+------------+------------+
| manager_id | customer_id | date_from  |  date_to   |
+------------+-------------+------------+------------+
|          1 |           1 | 2017-01-01 | 2017-01-02 |
|          2 |           1 | 2017-01-02 |       null |
+------------+-------------+------------+------------+

I need to transform them into the following indexed table:

+----------+-------------+------------+--------+
|  dt_id   | customer_id | manager_id | amount |
+----------+-------------+------------+--------+
| 20170101 |           1 |          1 |    234 |
| 20170102 |           1 |          2 |    123 |
+----------+-------------+------------+--------+

So, I created yet another database where I store the table above. Now, in order to update the table I have to truncate it and reinsert all the data again. I may join them all the in order to check the diffs, but it's too heavy for big tables as well. The trigger helps to track only changing records. The first input table can use a trigger:

create trigger SaleInsert
on Table1
after insert 
begin 
    insert into NewDB..IndexedTable
    select 
    //some transformation
    from inserted
    left join Table2
    on Table1.customer_id = Table2.customer_id
    and Table1.dt >= Table2.date_from
    and Table1.dt < Table2.date_to
end

The same idea for update, delete, a similar approach for the second table. I could get automatically updated DWH with little lags. Yeah, I expect performance lags for high-loaded databases. Theoretically, it should work smoothly on servers with the same configurations.

But, again, there are no triggers on the subscriber side only. Any ideas, alternatives?

like image 693
Danylo Korostil Avatar asked Oct 10 '17 07:10

Danylo Korostil


People also ask

Why would you use an instead of trigger?

An INSTEAD OF trigger is a trigger that allows you to update data in tables via their view which cannot be modified directly through DML statements. When you issue a DML statement such as INSERT , UPDATE , or DELETE to a non-updatable view, Oracle will issue an error.

Is trigger in same transaction?

The documentation states: In all cases, a trigger is executed as part of the same transaction as the statement that triggered it, so if either the statement or the trigger causes an error, the effects of both will be rolled back.

What is trigger and its types?

A trigger defines a set of actions that are performed in response to an insert, update, or delete operation on a specified table. When such an SQL operation is executed, the trigger is said to have been activated. Triggers are optional and are defined using the CREATE TRIGGER statement.


2 Answers

MS SQL Server has "Change Tracking" features that maybe be of use to you. You enable the database for change tracking and configure which tables you wish to track. SQL Server then creates change records on every update, insert, delete on a table and then lets you query for changes to records that have been made since the last time you checked. This is very useful for syncing changes and is more efficient than using triggers. It's also easier to manage than making your own tracking tables. This has been a feature since SQL Server 2005.

How to: Use SQL Server Change Tracking

Change tracking only captures the primary keys of the tables and let's you query which fields might have been modified. Then you can query the tables join on those keys to get the current data. If you want it to capture the data also you can use Change Capture, but it requires more overhead and at least SQL Server 2008 enterprise edition.

Change Data Capture

The general process is:

  • Get the current sync version
  • Get the last sync version you used to get changes
  • Get all the primary keys of the tables that have changed since that last version (inserts, updates, and deletes).
  • Join the keys with the data and pull down the data (if not using change capture).
  • Save the data and the current sync version (from the first step).

Then you repeat this process whenever you want to subscribe to the next set of changes. SQL Server does all the magic behind the scenes for you of storing the changes and the versioning. You also might want to look into Snapshot Isolation... it works well with it. The linked article has more information about that.

like image 187
Brian Pressler Avatar answered Nov 04 '22 00:11

Brian Pressler


This answer is somewhat roundabout, but given your tight limitations perhaps you'll consider it.

First, go for replication as you seem to have decided. You mentioned creating yet another database but were stuck for how to create triggers to populate it. The answer lies in the ability to run post-snapshot scripts. When creating the replication publication the DBA can specify a script to run on the Subscriber after the Snapshot.

Post-snapshot script

You can have the script create all the triggers you require.

Also, to prevent replication from overwriting your triggers with "no trigger" (as defined in the ERP database) the DBA will need to verify that for each table on which you have triggers the property Copy user triggers is set to False.

Copy user triggers

like image 21
Wayne Ivory Avatar answered Nov 04 '22 00:11

Wayne Ivory