Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimistic concurrency on multi-table complex entity

I have a complex entity (let's call it Thing) which is represented in SQL Server as many tables: one parent table dbo.Thing with several child tables dbo.ThingBodyPart, dbo.ThingThought, etc. We've implemented optimistic concurrency using a single rowversion column on dbo.Thing, using the UPDATE OUTPUT INTO technique. This has been working great, until we added a trigger to dbo.Thing. I'm looking for advice in choosing a different approach, because I'm fairly convinced that my current approach cannot be fixed.

Here is our current code:

CREATE PROCEDURE dbo.UpdateThing
    @id uniqueidentifier,
    -- ...
    -- ... other parameters describing what to update...
    -- ...
    @rowVersion binary(8) OUTPUT
AS
BEGIN TRANSACTION;
BEGIN TRY

    -- ...
    -- ... update lots of Thing's child rows...
    -- ...

    DECLARE @t TABLE (
        [RowVersion] binary(8) NOT NULL
    );

    UPDATE dbo.Thing
    SET ModifiedUtc = sysutcdatetime()
    OUTPUT INSERTED.[RowVersion] INTO @t
    WHERE
        Id = @id
        AND [RowVersion] = @rowVersion;

    IF @@ROWCOUNT = 0 RAISERROR('Thing has been updated by another user.', 16, 1);

    COMMIT;

    SELECT @rowVersion = [RowVersion] FROM @t;

END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK;
    EXEC usp_Rethrow_Error;
END CATCH

This worked absolutely beautifully, until we added an INSTEAD OF UPDATE trigger to dbo.Thing. Now the stored procedure no longer returns the new @rowVersion value, but returns the old unmodified value. I'm at a loss. Are there other ways to approach optimistic concurrency that would be as effective and easy as the one above, but would also work with triggers?


To illustrate what exactly goes wrong with this code, consider this test code:

DECLARE
    @id uniqueidentifier = 'b0442c71-dbcb-4e0c-a178-1a01b9efaf0f',
    @oldRowVersion binary(8),
    @newRowVersion binary(8),
    @expected binary(8);

SELECT @oldRowVersion = [RowVersion]
FROM dbo.Thing
WHERE Id = @id;

PRINT '@oldRowVersion = ' + convert(char(18), @oldRowVersion, 1);

DECLARE @t TABLE (
    [RowVersion] binary(8) NOT NULL
);

UPDATE dbo.Thing
SET ModifiedUtc = sysutcdatetime()
OUTPUT INSERTED.[RowVersion] INTO @t
WHERE
    Id = @id
    AND [RowVersion] = @oldRowVersion;

PRINT '@@ROWCOUNT = ' + convert(varchar(10), @@ROWCOUNT);

SELECT @newRowVersion = [RowVersion] FROM @t;

PRINT '@newRowVersion = ' + convert(char(18), @newRowVersion, 1);

SELECT @expected = [RowVersion]
FROM dbo.Thing
WHERE Id = @id;

PRINT '@expected = ' + convert(char(18), @expected, 1);

IF @newRowVersion = @expected PRINT 'Pass!'
ELSE PRINT 'Fail.  :('

When the trigger is not present, this code correctly outputs:

@oldRowVersion = 0x0000000000016CDC

(1 row(s) affected)
@@ROWCOUNT = 1
@newRowVersion = 0x000000000004E9D1
@expected = 0x000000000004E9D1
Pass!

When the trigger is present, we do not receive the expected value:

@oldRowVersion = 0x0000000000016CDC

(1 row(s) affected)

(1 row(s) affected)
@@ROWCOUNT = 1
@newRowVersion = 0x0000000000016CDC
@expected = 0x000000000004E9D1
Fail.  :(

Any ideas for a different approach?

I was assuming that an UPDATE was an atomic operation, which it is, except when there are triggers, when apparently it's not. Am I wrong? This seems really bad, in my opinion, with potential concurrency bugs lurking behind every statement. If the trigger really is INSTEAD OF, shouldn't I get back the correct timestamp, as though the trigger's UPDATE was the one I actually executed? Is this a SQL Server bug?

like image 748
Daniel Schilling Avatar asked Feb 05 '14 19:02

Daniel Schilling


People also ask

How does optimistic concurrency work?

Optimistic concurrency derives its name from the optimistic assumption that collisions between transactions will rarely occur; a collision is said to have occurred when another transaction updates or deletes a row of data between the time it is read by the current transaction and the time it is updated or deleted.

What is the advantage of optimistic vs pessimistic concurrency control?

These are so called because pessimistic concurrency control proactively prevents harm (harm, in this case, being a violation of the isolation property), whereas optimistic concurrency control assumes that no harm will happen, but if it is detected only then will measures be taken.

What describes an optimistic concurrency model?

In an optimistic concurrency model, a violation is considered to have occurred if, after a user receives a value from the database, another user modifies the value before the first user has attempted to modify it.


1 Answers

One of my esteemed co-workers, Jonathan MacCollum, pointed me to this bit of documentation:

INSERTED

Is a column prefix that specifies the value added by the insert or update operation. Columns prefixed with INSERTED reflect the value after the UPDATE, INSERT, or MERGE statement is completed but before triggers are executed.

From this, I presume that I need to modify my stored procedure, splitting the one UPDATE into an UPDATE followed by a SELECT [RowVersion] ....

UPDATE dbo.Thing
SET ModifiedUtc = sysutcdatetime()
WHERE
    Id = @id
    AND [RowVersion] = @rowVersion;

IF @@ROWCOUNT = 0 RAISERROR('Thing has been updated by another user.', 16, 1);

COMMIT;

SELECT @rowVersion = [RowVersion]
FROM dbo.Thing
WHERE Id = @id;

I think I can still rest assured that my stored procedure is not accidentally overwriting anybody else's changes, but I should no longer assume that the data that the caller of the stored procedure holds is still up-to-date. There's a chance that the new @rowVersion value returned by the stored procedure is actually the result of someone else's update, not mine. So actually, there's no point in returning the @rowVersion at all. After executing this stored procedure, the caller should re-fetch the Thing and all of its child records in order to be sure its picture of the data is consistent.

... which further leads me to conclude that rowversion columns are not the best choice for implementing optimistic locking, which sadly is their sole purpose. I would be much better off using a manually incremented int column, with a query like:

UPDATE dbo.Thing
SET Version = @version + 1
WHERE
    Id = @id
    AND Version = @version;

The Version column is checked and incremented in a single atomic operation, so there's no chance for other statements to slip in-between. I don't have to ask the database what the new value is, because I told it what the new value is. As long as the Version column contains the value I'm expecting (and assuming all other people updating this row are also playing by the rules - correctly incrementing Version), I can know that the Thing is still exactly as I left it. At least, I think...

like image 98
Daniel Schilling Avatar answered Nov 01 '22 11:11

Daniel Schilling