Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Data Sync between tables from SQL Server & Db2 - how to delete rows

Tags:

sql-server

db2

We have two databases one in SQL Server & one in DB2, we have a scenario where we do some data inserts & data updates and deletes in SQL Server & at the same time we also do data inserts updates & deletes in Db2.

We sync data back & forth using some processes, whenever there is a change from SQL Server we sync data to db2 for insert, update & delete, if we have a change in db2 we sync data to SQL Server, we use IBM MQ messages which we dequeue the messages to sync the changes back and forth.

Everything was good until we had some issues of data sync from Db2 to SQL Server, one of our process was down which sync from db2 to SQL Server, so there is an on demand job that runs every night that will do full data refresh from Db2 to SQL Server but we are only doing Merge Update & insert, we are not doing delete as data which is yet to be synced to db2 is also present in SQL Server, so we cannot directly delete as both databases can have more or less records, so data on SQL Server some of them are left orphan, we have a scoping so data which is getting updated in SQL Server cannot be change in db2 and vice versa.

My question is when we are syncing from Db2 to SQL Server, how to identify records that got deleted from db2 only so that we can delete those from SQL Server, we don't want to delete records that are created in SQL Server but yet to be sent to db2, we have 114 tables and we cannot maintain a flag if that is an option to differentiate.

like image 533
VR1256 Avatar asked Mar 02 '23 22:03

VR1256


2 Answers

When you said you are synchronizing data back and forth between MS SQL Server and DB2 Server, how are you capturing the changes? If using some CDC tool (IDR, GoldenGate, Informatica), these tools allow you to detect conflicts so you can decide what records to keep or delete.

If you are capturing your changes by an in-house development (triggers or your own log scraper ), you should keep at least the operation type and timestamp in your temporary change data set, so that you can recognize the operation.

If you are comparing the tables and deal with changes, you won't be able to recognize if missing columns at DB2 side represents rows deleted on DB2 side or rows added to SQL side... But you can fix that, by developing a proper change data capture mechanism.

like image 62
Daniel Lema Avatar answered Mar 06 '23 00:03

Daniel Lema


Change tracking on the sql server side might be a viable option (as long as all the tables you would like to sync/"delete from" have a primary key).

With CT you could track which rows, for each table, were created at the sql server side since the last sync from sql server to db2. Those rows should not be deleted yet:

DELETE 
FROM SQL_SERVER_TABLE
WHERE
  NOT EXISTS(SELECT * FROM CHANGETABLE())
  AND NOT EXISTS(SELECT * FROM DB2_staging)
like image 42
lptr Avatar answered Mar 06 '23 01:03

lptr