I have a table with some persistent data in it. When I query it, I also have a fairly complex CTE that computes the values required for the result, and I need to insert the missing rows into the persistent table. In the end I want to select a result consisting of all the rows identified by the CTE, but with the data from the table where the rows already existed, and I need to know whether each row has just been inserted or not.
Simplified, it works like this (the following code runs as a normal query if you'd like to try it):
    -- Set-up of test data, this would be the persisted table
    DECLARE @target TABLE (id int NOT NULL PRIMARY KEY);
    INSERT INTO @target (id)
    SELECT v.id
    FROM (VALUES (1), (2)) v(id);

    -- START OF THE CODE IN QUESTION
    -- The result table variable (will be several columns in the end)
    DECLARE @result TABLE (id int NOT NULL, new bit NOT NULL);

    WITH Source AS (
        -- Imagine a fairly expensive, recursive CTE here
        SELECT * FROM (VALUES (1), (3)) AS Source (id)
    )
    MERGE INTO @target AS Target
    USING Source
        ON Target.id = Source.id
    -- Perform a no-op on the match to get the output record
    WHEN MATCHED THEN
        UPDATE SET Target.id = Target.id
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (id) VALUES (SOURCE.id)
    -- select the data to be returned - will be more columns
    OUTPUT source.id,
        CASE WHEN $action='INSERT' THEN CONVERT(bit, 1) ELSE CONVERT(bit, 0) END
    INTO @result;

    -- Select the result
    SELECT * FROM @result;
I don't like the WHEN MATCHED THEN UPDATE part. I'd rather leave out the redundant update, but then I don't get the result row in the OUTPUT clause.
Is this the most efficient way to do this kind of completing and returning data?
Or would there be a more efficient solution without MERGE, for instance by pre-computing the result with a SELECT and then performing an INSERT of the rows which have new=1? I have difficulty interpreting the query plan, since it basically boils down to a "Clustered Index Merge", which is pretty vague to me performance-wise compared to the separate SELECT followed by INSERT variant. And I wonder whether SQL Server (2008 R2 with CU1) is actually smart enough to see that the UPDATE is a no-op (i.e. no write required).
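For comparison, here is a minimal sketch of the SELECT-then-INSERT variant described in the question, using the same test data. The aliases s, t, r and v are only illustrative, and the sketch makes no claim about which variant is faster. With a real table the two statements are not atomic, so concurrent writers would probably need a transaction or suitable locking.

```sql
-- Same test data as in the question
DECLARE @target TABLE (id int NOT NULL PRIMARY KEY);
INSERT INTO @target (id) VALUES (1), (2);

DECLARE @result TABLE (id int NOT NULL, new bit NOT NULL);

-- Step 1: compute the result once and flag rows missing from the target
WITH Source AS (
    -- Stand-in for the expensive CTE
    SELECT id FROM (VALUES (1), (3)) AS v (id)
)
INSERT INTO @result (id, new)
SELECT s.id,
       CASE WHEN t.id IS NULL THEN CONVERT(bit, 1) ELSE CONVERT(bit, 0) END
FROM Source AS s
LEFT JOIN @target AS t
    ON t.id = s.id;

-- Step 2: insert only the rows flagged as new
INSERT INTO @target (id)
SELECT r.id
FROM @result AS r
WHERE r.new = 1;

-- Select the result
SELECT id, new FROM @result ORDER BY id;
```

With this data the final SELECT should return id 1 with new = 0 and id 3 with new = 1.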
First, you specify the target table and the source table in the MERGE clause. Second, the merge_condition determines how the rows from the source table are matched to the rows from the target table. It is similar to the join condition in the join clause.
A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows. Note: you may still receive this error message even when the join operation does not produce duplicate rows.
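As an illustration of the GROUP BY advice, the following sketch collapses duplicate source rows before they reach the MERGE. The tables, the qty column and the choice of SUM as the aggregate are all assumptions made up for this example. Which aggregate is right depends on the data.

```sql
-- Hypothetical tables for illustration only
DECLARE @target TABLE (id int NOT NULL PRIMARY KEY, qty int NOT NULL);
INSERT INTO @target (id, qty) VALUES (1, 10);

DECLARE @source TABLE (id int NOT NULL, qty int NOT NULL);
-- id 1 appears twice; merging @source directly would match target row 1 twice
INSERT INTO @source (id, qty) VALUES (1, 5), (1, 7), (2, 3);

MERGE INTO @target AS t
USING (
    -- Group the source so each id occurs at most once
    SELECT id, SUM(qty) AS qty
    FROM @source
    GROUP BY id
) AS s
    ON t.id = s.id
WHEN MATCHED THEN
    UPDATE SET t.qty = s.qty
WHEN NOT MATCHED BY TARGET THEN
    INSERT (id, qty) VALUES (s.id, s.qty);

SELECT id, qty FROM @target ORDER BY id;
```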
A MERGE statement can contain at most two WHEN NOT MATCHED BY SOURCE clauses. If two are specified, one must perform an UPDATE and the other a DELETE.
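A small sketch of a MERGE with two WHEN NOT MATCHED BY SOURCE clauses might look like the following. The tables and the archived column are hypothetical. The first clause carries an additional AND condition so that the two clauses apply to different rows.

```sql
-- Hypothetical tables for illustration only
DECLARE @target TABLE (id int NOT NULL PRIMARY KEY, archived bit NOT NULL);
INSERT INTO @target (id, archived) VALUES (1, 0), (2, 0), (3, 1);

DECLARE @source TABLE (id int NOT NULL PRIMARY KEY);
INSERT INTO @source (id) VALUES (1);

MERGE INTO @target AS t
USING @source AS s
    ON t.id = s.id
-- First clause: rows missing from the source that are not yet archived
WHEN NOT MATCHED BY SOURCE AND t.archived = 0 THEN
    UPDATE SET t.archived = 1
-- Second clause: remaining rows missing from the source (already archived)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;

SELECT id, archived FROM @target ORDER BY id;
```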
You could declare a dummy variable and set its value in the WHEN MATCHED clause.
    DECLARE @dummy int;
    ...
    MERGE
    ...
    WHEN MATCHED THEN
        UPDATE SET @dummy = 0
    ...
I believe it should be less expensive than the actual table update.
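Applied to the code from the question, the suggestion could look like the sketch below. It assumes the same test data, and the alias v is only illustrative. The matched rows still go through the UPDATE branch, so $action reports 'UPDATE' for them and they appear in the OUTPUT. Whether this is actually cheaper than updating the column is not measured here.

```sql
-- Same test data as in the question
DECLARE @target TABLE (id int NOT NULL PRIMARY KEY);
INSERT INTO @target (id) VALUES (1), (2);

DECLARE @result TABLE (id int NOT NULL, new bit NOT NULL);

-- Dummy variable that absorbs the "update" for matched rows
DECLARE @dummy int;

WITH Source AS (
    -- Stand-in for the expensive CTE
    SELECT id FROM (VALUES (1), (3)) AS v (id)
)
MERGE INTO @target AS Target
USING Source
    ON Target.id = Source.id
WHEN MATCHED THEN
    -- Assign to the variable instead of touching a table column
    UPDATE SET @dummy = 0
WHEN NOT MATCHED BY TARGET THEN
    INSERT (id) VALUES (Source.id)
OUTPUT Source.id,
       CASE WHEN $action = 'INSERT' THEN CONVERT(bit, 1) ELSE CONVERT(bit, 0) END
INTO @result (id, new);

-- Select the result
SELECT id, new FROM @result ORDER BY id;
```

With this data the result should contain id 1 with new = 0 and id 3 with new = 1, the same as the original query.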