I am trying to understand how to safely increment a counter column that may be incremented simultaneously by many users (it's a Web API for a mobile app).
I've read the popular questions on SO about strategies for dealing with this issue, but I can't seem to figure out what's wrong with using a simple:
UPDATE Table SET Counter = Counter + 1
I've built the following code sample to try to produce inconsistent values and prove to myself that using only this simple update statement is not good practice:
using System;
using System.Collections.Generic;
using System.Configuration;
using System.Data.SqlClient;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        // Start 100 parallel "users", each of which increments the counter 100 times.
        List<Task> tasks = new List<Task>();
        for (int i = 0; i < 100; i++)
        {
            Task t = Task.Factory.StartNew(() =>
            {
                WriteToCounter();
            });
            tasks.Add(t);
        }
        Task.WaitAll(tasks.ToArray());
    }

    static void WriteToCounter()
    {
        string connString = ConfigurationManager.ConnectionStrings["DefaultConnection"].ConnectionString;
        using (SqlConnection connection = new SqlConnection(connString))
        {
            connection.Open();
            Random rnd = new Random();
            for (int i = 1; i <= 100; i++)
            {
                // Random short pause so updates from different tasks interleave.
                int wait = rnd.Next(1, 3);
                Thread.Sleep(wait);
                string sql = "UPDATE Table SET Counter = Counter + 1";
                SqlCommand command = new SqlCommand(sql, connection);
                command.ExecuteNonQuery();
            }
        }
    }
}
In the sample I am trying to simulate a scenario in which many users access the API simultaneously and update the counter. When the code finishes, the counter is always at exactly 10000 (100 tasks x 100 increments each), which means the result is consistent.
Does the test correctly simulate the scenario I described?
And if so, how come I can use the update statement without any special locking/transaction strategies and still get consistent results?
If you only ever use it as simply as this, you're fine.
The problems start when:

- you add conditions that depend on Counter, that's a great way to lose determinism
- you make the update part of a larger transaction (e.g. a TransactionScope)
- you rely on the value of Counter being a unique auto-incrementing identifier. It obviously doesn't work if you separate the select and update (and no, update based on select doesn't help - unlike plain update, the select isn't serialized with updates on the same row; that's where locking hints come in), and I'm not sure if using output is safe.

And of course, things might be quite different if the transaction isolation level changes. This is actually a legitimate cause of errors, because SQL connection pooling doesn't reset the transaction isolation level, so if you ever change it, you need to make sure it can't ever affect any other SQL you execute on a SqlConnection taken out of the pool.
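To make the "read the value you just incremented" case from the last bullet concrete, here is a minimal sketch of my own (not part of the answer above): one version returns the new value in the same statement via an OUTPUT clause, and one does select-then-update guarded by an explicit transaction and an UPDLOCK hint, which is the "locking hints" case the answer refers to. The [Table] and Counter names are the question's placeholders, Counter is assumed to be an int, and whether OUTPUT gives you strict uniqueness guarantees is something the answer deliberately leaves open.

static int IncrementAndRead(SqlConnection connection)
{
    // Single statement: the OUTPUT clause returns the value the row was updated to,
    // so there is no separate select at all.
    string sql = "UPDATE [Table] SET Counter = Counter + 1 OUTPUT INSERTED.Counter";
    using (SqlCommand command = new SqlCommand(sql, connection))
    {
        return (int)command.ExecuteScalar();
    }
}

static int IncrementAndReadWithLockHint(SqlConnection connection)
{
    // Select-then-update, made safe by an explicit transaction plus an UPDLOCK hint.
    // Without the hint, two sessions could both read the same value before updating.
    using (SqlTransaction tx = connection.BeginTransaction())
    {
        int current;
        using (SqlCommand select = new SqlCommand(
            "SELECT Counter FROM [Table] WITH (UPDLOCK)", connection, tx))
        {
            current = (int)select.ExecuteScalar();
        }
        using (SqlCommand update = new SqlCommand(
            "UPDATE [Table] SET Counter = @NewValue", connection, tx))
        {
            update.Parameters.AddWithValue("@NewValue", current + 1);
            update.ExecuteNonQuery();
        }
        tx.Commit();
        return current + 1;
    }
}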
how come I can use the update statement without any special locking/transaction strategies and still get consistent results?
Because you get a lot of these features automatically when you're working with a database that offers ACID guarantees.
For instance, every DML query runs inside a transaction. In SQL Server, the default is for it to run in autocommit mode. In this mode, if you execute a query and there is no open transaction, it creates a new one. If the query completes without error, it automatically commits the transaction. In an alternative mode called implicit transactions, it will still automatically create a new transaction if there's no open transaction, but it leaves it up to the user whether to actually perform the commit.
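As a small illustration of that difference (my own sketch, not part of the answer, reusing the question's open SqlConnection and placeholder names): in autocommit mode the bare UPDATE commits by itself, while with implicit transactions enabled the same statement silently opens a transaction that only becomes durable once you issue a COMMIT.

// Autocommit (the SQL Server default): the statement commits on its own.
using (SqlCommand autocommit = new SqlCommand(
    "UPDATE [Table] SET Counter = Counter + 1", connection))
{
    autocommit.ExecuteNonQuery();
}

// Implicit transactions: the same UPDATE opens a transaction,
// and nothing is committed until the explicit COMMIT runs.
using (SqlCommand implicitMode = new SqlCommand(
    "SET IMPLICIT_TRANSACTIONS ON; " +
    "UPDATE [Table] SET Counter = Counter + 1; " +
    "COMMIT; " +
    "SET IMPLICIT_TRANSACTIONS OFF;", connection))
{
    implicitMode.ExecuteNonQuery();
}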
As to locks, there's a fair bit of complexity here as well. There are various forms of locks, trying to achieve trade-offs between allowing concurrency whilst preventing inconsistencies from arising. And, in fact, SQL Server has a dedicated type of lock, just for UPDATEs, that is designed to ensure that two parallel attempts to UPDATE the same resources will get correctly serialized (rather than allowing the attempts to overlap and potentially deadlock).
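If you want to see those locks for yourself, one way (again just a sketch of mine, not something from the answer) is to hold the increment open in a transaction and query the sys.dm_tran_locks DMV from the same session. By that point the row shows an exclusive (X) lock held until commit; the update (U) lock is taken only transiently while the row is being located. Querying the DMV requires VIEW SERVER STATE permission.

// Sketch: hold the increment open and list the locks this session owns.
using (SqlTransaction tx = connection.BeginTransaction())
{
    using (SqlCommand update = new SqlCommand(
        "UPDATE [Table] SET Counter = Counter + 1", connection, tx))
    {
        update.ExecuteNonQuery();
    }

    using (SqlCommand locks = new SqlCommand(
        "SELECT resource_type, request_mode, request_status " +
        "FROM sys.dm_tran_locks WHERE request_session_id = @@SPID", connection, tx))
    using (SqlDataReader reader = locks.ExecuteReader())
    {
        while (reader.Read())
        {
            Console.WriteLine("{0} {1} {2}", reader[0], reader[1], reader[2]);
        }
    }

    tx.Commit();
}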
So, long answer short, the UPDATE you show in your question is perfectly valid.