I've been searching around for the answers to this question, and there's some conflicting or ambiguous information out there, finding it hard to find a for-sure answer.
My context: I'm in node.js using the 'mssql' npm package. My SQL server is Microsoft SQL Server 2014.
I have a record that may or may not exist in a table already -- if it exists I want to update it, otherwise I want to insert it. I'm not sure what the optimal SQL is, or if there's some kind of 'transaction' I should be running in mssql. I've found some options that seem good, but I'm not sure about any of them:
Option 1: how to update if exists or insert
Problem with this is I'm not even sure this is valid syntax in MSSQL. I do like it though, and it seems to support doing multiple rows at once too which I like.
INSERT INTO table (id, user, date, points)
VALUES (1, 1, '2017-03-03', 25),
(2, 1, '2017-03-04', 25),
(3, 2, '2017-03-03', 100),
(4, 2, '2017-03-04', 150)
ON DUPLICATE KEY UPDATE points = VALUES(points)
Option 2:
don't know if there's any problem with this one, just not sure if it's optimal. Doesn't seem to support multiple simultaneous rows
update test set name='john' where id=3012
IF @@ROWCOUNT=0
insert into test(name) values('john');
Option 3: Merge, https://dba.stackexchange.com/questions/89696/how-to-insert-or-update-using-single-query
Some people say this is a bit buggy or something? This also apparently supports multiple at once which I like.
MERGE dbo.Test WITH (SERIALIZABLE) AS T
USING (VALUES (3012, 'john')) AS U (id, name)
ON U.id = T.id
WHEN MATCHED THEN
UPDATE SET T.name = U.name
WHEN NOT MATCHED THEN
INSERT (id, name)
VALUES (U.id, U.name);
Every one of them has different purpose, pros and cons.
Option 1 is good for multi row inserts/updates. However It only checks primary key constraints.
Option 2 is good for small sets of data. Single record insertion/update. It is more like script.
Option 3 is best for big queries. Lets say, reading from one table and inserting/updating to another accordingly. You can define which condition to be satisfied for insertion and/or update. You are not limited to primary key/unique constraint.
If your system is highly concurrent, and performance is important - you can try following pattern, if updates are more common than inserts:
BEGIN TRANSACTION;
UPDATE dbo.t WITH (UPDLOCK, SERIALIZABLE) SET val = @val WHERE [key] = @key;
IF @@ROWCOUNT = 0
BEGIN
INSERT dbo.t([key], val) VALUES(@key, @val);
END
COMMIT TRANSACTION;
Reference: https://sqlperformance.com/2020/09/locking/upsert-anti-pattern
Also read: https://michaeljswart.com/2017/07/sql-server-upsert-patterns-and-antipatterns/
If inserts are more common:
BEGIN TRY
INSERT INTO dbo.AccountDetails (Email, Etc) VALUES (@Email, @Etc);
END TRY
BEGIN CATCH
-- ignore duplicate key errors, throw the rest.
IF ERROR_NUMBER() IN (2601, 2627)
UPDATE dbo.AccountDetails
SET Etc = @Etc
WHERE Email = @Email;
END CATCH
I wouldn't use merge, while most of the bugs are apparently fixed - we have had major issues with it before in production.
EDIT ---
Yes above answers were for single rows - For multiple rows, you'd do something like this: The idea behind the locking is the same though
BEGIN TRANSACTION;
UPDATE t WITH (UPDLOCK, SERIALIZABLE)
SET val = tvp.val
FROM dbo.t AS t
INNER JOIN @tvp AS tvp
ON t.[key] = tvp.[key];
INSERT dbo.t([key], val)
SELECT [key], val FROM @tvp AS tvp
WHERE NOT EXISTS (SELECT 1 FROM dbo.t WHERE [key] = tvp.[key]);
COMMIT TRANSACTION;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With