 

Atomic increment of counter column using simple update

I am trying to understand how to safely increment a counter column that may be incremented simultaneously by many users (it's a web API for a mobile app).

I've read the popular questions on SO about strategies for dealing with this issue, but I can't figure out what's wrong with using a simple:

UPDATE Table SET Counter = Counter + 1  

I've built the following code sample to try to produce inconsistent values and prove to myself that using only this simple update statement is bad practice:

class Program
{
    // Requires: System.Collections.Generic, System.Configuration,
    // System.Data.SqlClient, System.Threading, System.Threading.Tasks
    static void Main(string[] args)
    {
        List<Task> tasks = new List<Task>();

        // 100 concurrent writers, each performing 100 increments
        for (int i = 0; i < 100; i++)
        {
            Task t = Task.Factory.StartNew(() => WriteToCounter());
            tasks.Add(t);
        }

        Task.WaitAll(tasks.ToArray());
    }

    static void WriteToCounter()
    {
        string connString = ConfigurationManager.ConnectionStrings["DefaultConnection"].ConnectionString;

        using (SqlConnection connection = new SqlConnection(connString))
        {
            connection.Open();
            Random rnd = new Random();
            for (int i = 1; i <= 100; i++)
            {
                // Sleep 1-2 ms to stagger the updates a little
                int wait = rnd.Next(1, 3);
                Thread.Sleep(wait);

                string sql = "UPDATE Table SET Counter = Counter + 1";

                using (SqlCommand command = new SqlCommand(sql, connection))
                {
                    command.ExecuteNonQuery();
                }
            }
        }
    }
}

In the sample I am trying to simulate a scenario in which many users access the API simultaneously and update the counter. When the code runs, the counter always ends up at exactly 10000 (100 tasks × 100 increments each), which means it is consistent.

Does the test correctly simulate the scenario I described?
And if so, how come I can use the update statement without any special locking/transaction strategies and still get consistent results?

asked Apr 29 '15 by Yaron Levi

2 Answers

If you only ever use it as simply as this, you're fine.

The problems start when:

  • You add a condition - most conditions are fine, but avoid filtering based on Counter itself; that's a great way to lose determinism
  • You update inside of a transaction (careful about this - it's easy to be in a transaction outside of the scope of the actual update statement, even more so if you use e.g. TransactionScope)
  • You combine inserts and updates (e.g. the usual "insert if not exists" pattern) - this is not a problem if you only have a single counter, but for multiple counters it's easy to fall into this trap; not too hard to solve, unless you also have deletes, then it becomes a whole different league :)
  • Maybe if you rely on the value of Counter being a unique auto-incrementing identifier. It obviously doesn't work if you separate the select and the update (and no, an update based on a select doesn't help - unlike a plain update, the select isn't serialized with updates on the same row; that's where locking hints come in). I'm not sure whether using output is safe.
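For reference, this is what the single-statement read-back with OUTPUT looks like, using the question's table (whether it carries the same guarantees as the plain update is exactly the open question in the last bullet):

```sql
-- Increment and return the new value in one statement; the OUTPUT
-- clause reads the row image produced by this same UPDATE.
UPDATE [Table]
SET Counter = Counter + 1
OUTPUT inserted.Counter;
```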

And of course, things might be quite different if the transaction isolation level changes. This is actually a legitimate source of errors, because SQL connection pooling doesn't reset the transaction isolation level; so if you ever change it, you need to make sure it can't affect any other SQL you execute on a SqlConnection taken out of the pool.
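A defensive sketch of that last point: if a session ever raises the isolation level, restore the default before the connection goes back to the pool (hypothetical session; SQL Server defaults assumed):

```sql
-- Hypothetical: some earlier code raised the isolation level...
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- ... do the work that needed it ...

-- Restore the SQL Server default before the connection is returned
-- to the pool; historically the pooled-connection reset did not
-- undo this setting.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
```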

answered Sep 23 '22 by Luaan


how come I can use the update statement without any special locking/transaction strategies and still get consistent results?

Because you get a lot of these features automatically when you're working with a database that offers ACID guarantees.

For instance, every DML query runs inside a transaction. In SQL Server, the default is autocommit mode: if you execute a query and there is no open transaction, a new one is created, and if the query completes without error, the transaction is automatically committed. In the alternative mode, called implicit transactions, a new transaction is still automatically created when there's no open one, but actually committing it is left up to the user.
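The two modes side by side (a sketch, reusing the question's table):

```sql
-- Autocommit (the SQL Server default): the statement runs in its
-- own transaction, committed automatically on success.
UPDATE [Table] SET Counter = Counter + 1;

-- Implicit transactions: the first DML statement opens a
-- transaction, but committing it is left to the user.
SET IMPLICIT_TRANSACTIONS ON;
UPDATE [Table] SET Counter = Counter + 1;
COMMIT;  -- without this, the change stays pending and its locks are held
```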

As to locks, there's a fair bit of complexity here as well. There are various kinds of locks, each trading off concurrency against the risk of inconsistency. And, in fact, SQL Server has a dedicated lock type just for UPDATEs (the update lock, U), designed to ensure that two parallel attempts to UPDATE the same resource get correctly serialized (rather than allowing the attempts to overlap and potentially deadlock).
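This is also why the select-then-update pattern mentioned in the first answer breaks without hints: a plain SELECT takes only a shared lock, so two sessions can read the same value and both write value + 1. A sketch of the locking-hint fix, where UPDLOCK makes the read take an update lock and thereby serializes concurrent read-modify-write attempts on the row:

```sql
BEGIN TRANSACTION;

-- The update lock acquired here is held until COMMIT, so no other
-- session can slip in between the read and the write.
DECLARE @current int;
SELECT @current = Counter FROM [Table] WITH (UPDLOCK);

UPDATE [Table] SET Counter = @current + 1;

COMMIT;
```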

So, long story short: the UPDATE you show in your question is perfectly valid.

answered Sep 21 '22 by Damien_The_Unbeliever