
SQL MERGE statement causes duplication while multithreading

I have a MERGE statement that performs an upsert based on a key (for simplicity, call it RefId). Its purpose is to keep the table unique by RefId. In production, however, multiple servers insert into this table through the same stored proc, and if two servers use the same RefId at nearly the same moment, duplication occurs (i.e. 2 inserts instead of 1 insert and 1 update). I believe this is because SQL Server locks the newly inserted row, so the other, parallel call to the stored proc cannot detect its existence. NOLOCK is not supported for the MERGE statement, so there is no obvious workaround that I can see. I have simulated parallel inserts using multiple threads (instead of servers), and duplication occasionally occurs in that case as well. Other than enforcing a unique constraint on the DB (which I can't do, for reasons outside my control), is there any way to get my upsert to work as expected under concurrency?

Here's the stored proc. It works correctly when the inserts and updates are reasonably spaced in time and fails only in parallel cases:

WITH UniqueData AS
(
    -- Keep one row per RefId from the incoming batch.
    -- SQL Server ranking functions need an ORDER BY in the OVER clause;
    -- ROW_NUMBER() guarantees a single row with UniqueRank = 1 per RefId.
    SELECT *
    FROM
    (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY RefId ORDER BY (SELECT NULL)) AS UniqueRank
        FROM @Data
    ) AS Ranked
    WHERE UniqueRank = 1
)
MERGE MyTable AS destination
USING UniqueData AS source
    ON (destination.RefId = source.RefId)
WHEN NOT MATCHED THEN
    INSERT (RefId, Miles, UpdateUTC)
    VALUES (source.RefId, source.Miles, getutcdate())
WHEN MATCHED THEN
    UPDATE SET
        destination.Miles = ISNULL(source.Miles, destination.Miles),
        destination.UpdateUTC = getutcdate();

asked May 07 '14 05:05 by arviman



1 Answer

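You can add the WITH (HOLDLOCK) table hint to the MERGE target. HOLDLOCK is equivalent to SERIALIZABLE for that table, so the key range checked by the ON clause stays locked until the transaction completes, and a concurrent MERGE for the same RefId has to wait instead of inserting a second row: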

...
 MERGE MyTable WITH (HOLDLOCK) AS destination
...

And you can read the details here: http://weblogs.sqlteam.com/dang/archive/2009/01/31/UPSERT-Race-Condition-With-MERGE.aspx
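
For reference, here is a sketch of how the hint might look when applied to the full statement from the question. It is not a tested drop-in replacement: the `@Data` declaration, its column types, and the sample rows are assumptions added only so the batch is self-contained, and `MyTable` is assumed to already exist with the columns `RefId`, `Miles`, and `UpdateUTC`. `ROW_NUMBER()` with `ORDER BY (SELECT NULL)` is used for the de-duplication step because SQL Server requires an `ORDER BY` in the `OVER` clause of a ranking function.

```sql
-- Hypothetical stand-in for the proc's @Data parameter (types are assumed).
DECLARE @Data TABLE (RefId INT NOT NULL, Miles INT NULL);
INSERT INTO @Data (RefId, Miles) VALUES (1, 100), (1, 100), (2, NULL);

-- Assumes MyTable(RefId, Miles, UpdateUTC) already exists.
WITH UniqueData AS
(
    -- Keep exactly one row per RefId from the incoming batch.
    SELECT RefId, Miles
    FROM
    (
        SELECT RefId, Miles,
               ROW_NUMBER() OVER (PARTITION BY RefId ORDER BY (SELECT NULL)) AS UniqueRank
        FROM @Data
    ) AS Ranked
    WHERE UniqueRank = 1
)
-- HOLDLOCK makes the existence check and the insert/update act as one
-- serializable unit for the matched key range.
MERGE MyTable WITH (HOLDLOCK) AS destination
USING UniqueData AS source
    ON destination.RefId = source.RefId
WHEN NOT MATCHED THEN
    INSERT (RefId, Miles, UpdateUTC)
    VALUES (source.RefId, source.Miles, GETUTCDATE())
WHEN MATCHED THEN
    UPDATE SET
        destination.Miles     = ISNULL(source.Miles, destination.Miles),
        destination.UpdateUTC = GETUTCDATE();
```

An index on RefId is likely worth having here: without one, the serializable range lock has to cover whatever the scan touches, which can block far more than the single key being upserted.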

answered Oct 03 '22 06:10 by PeterRing