Atomic UPSERT in SQL Server 2005

Tags: sql-server, sql-server-2005, atomic, upsert

What is the correct pattern for doing an atomic "UPSERT" (UPDATE where exists, INSERT otherwise) in SQL Server 2005?

I see a lot of code on SO (e.g. see Check if a row exists, otherwise insert) with one of the following two-part patterns:

    UPDATE ... FROM ... WHERE <condition>
    -- race condition risk here
    IF @@ROWCOUNT = 0
      INSERT ...

    IF (SELECT COUNT(*) FROM ... WHERE <condition>) = 0
      -- race condition risk here
      INSERT ...
    ELSE
      UPDATE ...

where <condition> will be an evaluation of natural keys. Neither of the above approaches seems to deal well with concurrency. If I cannot have two rows with the same natural key, it seems that both of them risk inserting rows with the same natural keys in race condition scenarios.

I have been using the following approach but I'm surprised not to see it anywhere in people's responses so I'm wondering what is wrong with it:

    INSERT INTO <table>
    SELECT <natural keys>, <other stuff...>
    FROM <table>
    WHERE NOT EXISTS
       -- race condition risk here?
       ( SELECT 1 FROM <table> WHERE <natural keys> )

    UPDATE ...
    WHERE <natural keys>

Note that the race condition mentioned here is a different one from the ones in the earlier code. In the earlier code, the issue was phantom reads (rows being inserted by another session between the UPDATE and the IF, or between the SELECT and the INSERT). In the above code, the race condition has to do with DELETEs. Is it possible for a matching row to be deleted by another session AFTER the WHERE NOT EXISTS executes but before the UPDATE executes? It's not clear whether the WHERE NOT EXISTS puts a lock on anything in conjunction with the UPDATE.

Is this atomic? I can't locate where this would be documented in SQL Server documentation.

EDIT: I realise this could be done with transactions, but I think I would need to set the transaction isolation level to SERIALIZABLE to avoid the phantom read problem? Surely that is overkill for such a common problem?

506

asked Mar 26 '10 10:03

rabidpebble

2 Answers

    INSERT INTO <table>
    SELECT <natural keys>, <other stuff...>
    FROM <table>
    WHERE NOT EXISTS
       -- race condition risk here?
       ( SELECT 1 FROM <table> WHERE <natural keys> )

    UPDATE ...
    WHERE <natural keys>

There is a race condition in the first INSERT. The key may not exist when the inner query SELECT runs, but does exist at INSERT time, resulting in a key violation.
There is a race condition between the INSERT and the UPDATE. The key may exist when checked in the inner query of the INSERT, but be gone by the time the UPDATE runs.

For the second race condition one could argue that the key would have been deleted anyway by the concurrent thread, so it's not really a lost update.

The optimal solution is usually to try the most likely case, and handle the error if it fails (inside a transaction, of course):

If the key is likely missing, always insert first. Handle the unique constraint violation and fall back to update.
If the key is likely present, always update first. Insert if no row was found. Handle a possible unique constraint violation and fall back to update.
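The insert-first case can be sketched with TRY/CATCH, which is available in SQL Server 2005. This is only an illustration under stated assumptions: the table dbo.Widget, its columns, and the values are hypothetical, and XACT_ABORT is assumed to be OFF so that a duplicate-key error does not doom an enclosing transaction. Error numbers 2627 and 2601 are the duplicate-key errors for a constraint and a unique index respectively.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE dbo.Widget (
    WidgetKey int NOT NULL PRIMARY KEY,  -- stands in for the natural key
    Qty       int NOT NULL
);
GO

-- SQL Server 2005 does not allow DECLARE with an initializer, hence SET.
DECLARE @key int, @qty int;
SET @key = 42;
SET @qty = 7;

BEGIN TRY
    -- The key is assumed to be missing most of the time: try the insert first.
    INSERT INTO dbo.Widget (WidgetKey, Qty) VALUES (@key, @qty);
END TRY
BEGIN CATCH
    -- 2627: PRIMARY KEY / UNIQUE constraint violation
    -- 2601: duplicate key row in a unique index
    IF ERROR_NUMBER() IN (2627, 2601)
        -- Another session got there first: fall back to update.
        UPDATE dbo.Widget SET Qty = @qty WHERE WidgetKey = @key;
    ELSE
    BEGIN
        -- Anything else is not ours to handle: re-raise the message.
        DECLARE @msg nvarchar(2048);
        SET @msg = ERROR_MESSAGE();
        RAISERROR(@msg, 16, 1);
    END
END CATCH;
```

If the UPDATE in the CATCH block affects zero rows, the row was deleted by a concurrent session in the meantime, which is the case argued above not to be a lost update.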

Besides correctness, this pattern is also optimal for speed: it is more efficient to try the insert and handle the exception than to do spurious lookups. Lookups mean logical page reads (which may mean physical page reads), and IO (even logical) is more expensive than SEH (structured exception handling).

Update @Peter

Why isn't a single statement 'atomic'? Let's say we have a trivial table:

create table Test (id int primary key);

Now if I ran this single statement from two threads, in a loop, it would be 'atomic', as you say, and no race condition could exist:

    insert into Test (id)
        select top (1) id
        from Numbers n
        where not exists (select id from Test where id = n.id);

Yet in only a couple of seconds, a primary key violation occurs:

Msg 2627, Level 14, State 1, Line 4
Violation of PRIMARY KEY constraint 'PK__Test__24927208'. Cannot insert duplicate key in object 'dbo.Test'.

Why is that? You are correct that the SQL query plan will do the 'right thing' on DELETE ... FROM ... JOIN, on WITH cte AS (SELECT ... FROM ...) DELETE FROM cte, and in many other cases. But there is a crucial difference in these cases: the 'subquery' refers to the target of an update or delete operation. For such cases the query plan will indeed use an appropriate lock; in fact this behavior is critical in certain cases, such as when implementing queues (see Using tables as Queues).

But in the original question, as well as in my example, the subquery is seen by the query optimizer just as a subquery in a query, not as some special 'scan for update' type of query that needs special lock protection. The result is that the execution of the subquery lookup can be observed as a distinct operation by a concurrent observer, thus breaking the 'atomic' behavior of the statement. Unless special precaution is taken, multiple threads can attempt to insert the same value, each convinced it has checked and the value doesn't already exist. Only one can succeed; the other will hit the PK violation. QED.
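The answer does not say which 'special precaution' it has in mind. One option, assumed here purely for illustration, is to put UPDLOCK and HOLDLOCK table hints on the existence check, so that the lookup takes key-range locks that a second session cannot acquire at the same time. The Numbers table and its contents below are made up for the sketch.

```sql
-- Assumed setup: Test as defined above, plus a small hypothetical Numbers table.
create table Test (id int primary key);
create table Numbers (id int primary key);

-- SQL Server 2005 has no multi-row VALUES, so populate with UNION ALL.
insert into Numbers (id)
    select 1 union all select 2 union all select 3;
go

-- Same statement as above, with lock hints on the subquery.
-- updlock:  take update locks instead of shared locks on the lookup
-- holdlock: hold key-range locks so no other session can insert the missing key
insert into Test (id)
    select top (1) id
    from Numbers n
    where not exists (select id
                      from Test with (updlock, holdlock)
                      where id = n.id);
```

With the hints in place, a second session running the same statement should block on the existence check until the first one finishes, rather than passing the check and failing on the insert.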

149

answered Oct 09 '22 03:10

Remus Rusanu

Pass updlock, rowlock, holdlock hints when testing for existence of the row. Holdlock ensures that all inserts are serialised; rowlock permits concurrent updates to existing rows.

Updates may still block if your PK is a bigint, as the internal hashing is degenerate for 64-bit values.

    begin tran -- default read committed isolation level is fine

    if not exists (select * from <table> with (updlock, rowlock, holdlock) where <PK = ...>)
        -- insert
    else
        -- update

    commit
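A filled-in version of the template above might look like the following. The table dbo.Widget, its columns, and the values are hypothetical and only serve to make the sketch runnable.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE dbo.Widget (
    WidgetKey int NOT NULL PRIMARY KEY,
    Qty       int NOT NULL
);
GO

DECLARE @key int, @qty int;
SET @key = 42;
SET @qty = 7;

BEGIN TRAN;  -- default READ COMMITTED isolation level is assumed

-- The hints keep the lock on the key (or on the gap where it would be)
-- until COMMIT, so the branch taken below cannot be invalidated.
IF NOT EXISTS (SELECT *
               FROM dbo.Widget WITH (UPDLOCK, ROWLOCK, HOLDLOCK)
               WHERE WidgetKey = @key)
    INSERT INTO dbo.Widget (WidgetKey, Qty) VALUES (@key, @qty);
ELSE
    UPDATE dbo.Widget SET Qty = @qty WHERE WidgetKey = @key;

COMMIT;
```

Running the second batch twice inserts the row the first time and updates it the second time.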

answered Oct 09 '22 02:10

Cassius Porcus
