I have a MS SQL Server with a linked MySQL server. I need to partially synchronize a table between the two servers. This is done in three steps and based on a condition:
Delete all rows from the MySQL table that do not satisfy the condition
Insert all new rows in the MySQL table that satisfy the condition
Update all rows in the MySQL server that satisfy the condition and have different data between MySQL and SQL Server
Steps 1 and 2 always run without a problem. But step 3 won't run if there is anything to update. The query fails with the following exception: The rowset was using optimistic concurrency and the value of a column has been changed after the containing row was last fetched or resynchronized.].
This is the query that is executed:
update mysqlserver...subscribers
set Firstname = Voornaam,
Middlename = Tussenvoegsel,
Surname = Achternaam,
email = e-mail
from mysqlserver...subscribers as b, tblkandidaat
where (b.kandidaatid = tblkandidaat.kandidaatid) and
(tblkandidaat.kandidaatid in (
select subsc.kandidaatid
from mysqlserver...subscribers subsc inner join tblKandidaat
on (subsc.kandidaatid=tblKandidaat.kandidaatid)
where (subsc.list=1) and
((subsc.firstname COLLATE Latin1_General_CI_AI <> Voornaam
or (subsc.middlename COLLATE Latin1_General_CI_AI <> Tussenvoegsel)
or (subsc.surname COLLATE Latin1_General_CI_AI <> tblKandidaat.Achternaam)
or (subsc.email COLLATE Latin1_General_CI_AI <> tblKandidaat.e-mail))
));
Anybody has an idea about how to prevent this?
Try this query instead:
update b
set
Firstname = Voornaam,
Middlename = Tussenvoegsel,
Surname = Achternaam,
email = e-mail
from
mysqlserver...subscribers b
inner join tblkandidaat k on b.kandidaatid = k.kandidaatid
where
b.list=1
and (
b.firstname COLLATE Latin1_General_CI_AI <> k.Voornaam
or b.middlename COLLATE Latin1_General_CI_AI <> k.Tussenvoegsel
or b.surname COLLATE Latin1_General_CI_AI <> k.Achternaam
or b.email COLLATE Latin1_General_CI_AI <> k.e-mail
)
It's best practice to use ANSI joins and properly separate JOIN conditions from WHERE conditions.
It's more readable to use aliases for all your tables instead of long table names throughout the query.
It's best to use the aliases for all column references instead of leaving them blank. Not only is it a good habit and makes things clearer, it can avoid some very nasty errors in inner-vs-outer table references.
If performance is also an issue: linked server joins sometimes devolve to row-by-row processing in the DB data provider engine. I have found cases where breaking out part of a complex join across a linked server into a regular join followed by a cross apply hugely reduced the unneeded rows being fetched and greatly improved performance. (This was essentially doing a bookmark lookup, aka a nonclustered index scan followed by clustered index seek using those values). While this may not perfectly mesh with how MySql works, it's worth experimenting with. If you can do any kind of trace to see the actual queries being performed on the MySql side you might get insight as to other methods to use for increased performance.
Another performance-improving idea is to copy the remote data locally to a temp table, and add an ActionRequired column. Then update the temp table so it looks like it should, putting 'U', 'I', or 'D' in ActionRequired, then perform the merge/upsert across the linked server with a simple equijoins on the primary key, using ActionRequired. Careful attention to possible race conditions where the remote database could be updated during processing are in order.
Beware of nulls... are all those columns you're comparing non-nullable?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With