There are two servers. The first is ERP system on production. The second one is the BI server for heavy analytical queries. We update the BI server on a daily basis via backups. However, it's not enough, some users want to see their data changes more often than the next day. I have no access to the ERP server and can't do anything, except asking for backups or replications.
Before starting to ask for the replication. I want to understand if it's possible to use subscriber triggers in order to process not all the data, but changed. There is an ETL process to make some queries faster (indexing, transformation, etc). Triggers should do the trick, but I can't find a way to use them on the subscriber side only. The ERP system doesn't allow to make any changes on the DB level. So, the subscriber database seems to be fine for triggers (they don't affect on the ERP server performance). Nonetheless, I can't find a way to set them up. Processing all the data is an insane overhead.
Use case: Simplified example, say, we have two replicated tables:
+------------+-------------+--------+
| dt | customer_id | amount |
+------------+-------------+--------+
| 2017-01-01 | 1 | 234 |
| 2017-01-02 | 1 | 123 |
+------------+-------------+--------+
+------------+-------------+------------+------------+
| manager_id | customer_id | date_from | date_to |
+------------+-------------+------------+------------+
| 1 | 1 | 2017-01-01 | 2017-01-02 |
| 2 | 1 | 2017-01-02 | null |
+------------+-------------+------------+------------+
I need to transform them into the following indexed table:
+----------+-------------+------------+--------+
| dt_id | customer_id | manager_id | amount |
+----------+-------------+------------+--------+
| 20170101 | 1 | 1 | 234 |
| 20170102 | 1 | 2 | 123 |
+----------+-------------+------------+--------+
So, I created yet another database where I store the table above. Now, in order to update the table I have to truncate it and reinsert all the data again. I may join them all the in order to check the diffs, but it's too heavy for big tables as well. The trigger helps to track only changing records. The first input table can use a trigger:
create trigger SaleInsert
on Table1
after insert
begin
insert into NewDB..IndexedTable
select
//some transformation
from inserted
left join Table2
on Table1.customer_id = Table2.customer_id
and Table1.dt >= Table2.date_from
and Table1.dt < Table2.date_to
end
The same idea for update, delete, a similar approach for the second table. I could get automatically updated DWH with little lags. Yeah, I expect performance lags for high-loaded databases. Theoretically, it should work smoothly on servers with the same configurations.
But, again, there are no triggers on the subscriber side only. Any ideas, alternatives?
An INSTEAD OF trigger is a trigger that allows you to update data in tables via their view which cannot be modified directly through DML statements. When you issue a DML statement such as INSERT , UPDATE , or DELETE to a non-updatable view, Oracle will issue an error.
The documentation states: In all cases, a trigger is executed as part of the same transaction as the statement that triggered it, so if either the statement or the trigger causes an error, the effects of both will be rolled back.
A trigger defines a set of actions that are performed in response to an insert, update, or delete operation on a specified table. When such an SQL operation is executed, the trigger is said to have been activated. Triggers are optional and are defined using the CREATE TRIGGER statement.
MS SQL Server has "Change Tracking" features that maybe be of use to you. You enable the database for change tracking and configure which tables you wish to track. SQL Server then creates change records on every update, insert, delete on a table and then lets you query for changes to records that have been made since the last time you checked. This is very useful for syncing changes and is more efficient than using triggers. It's also easier to manage than making your own tracking tables. This has been a feature since SQL Server 2005.
How to: Use SQL Server Change Tracking
Change tracking only captures the primary keys of the tables and let's you query which fields might have been modified. Then you can query the tables join on those keys to get the current data. If you want it to capture the data also you can use Change Capture, but it requires more overhead and at least SQL Server 2008 enterprise edition.
Change Data Capture
The general process is:
Then you repeat this process whenever you want to subscribe to the next set of changes. SQL Server does all the magic behind the scenes for you of storing the changes and the versioning. You also might want to look into Snapshot Isolation... it works well with it. The linked article has more information about that.
This answer is somewhat roundabout, but given your tight limitations perhaps you'll consider it.
First, go for replication as you seem to have decided. You mentioned creating yet another database but were stuck for how to create triggers to populate it. The answer lies in the ability to run post-snapshot scripts. When creating the replication publication the DBA can specify a script to run on the Subscriber after the Snapshot.
You can have the script create all the triggers you require.
Also, to prevent replication from overwriting your triggers with "no trigger" (as defined in the ERP database) the DBA will need to verify that for each table on which you have triggers the property Copy user triggers is set to False.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With