Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the real cost of a SQL transaction?

This requires a bit of background. I'm creating a web app (ASP.NET/VB.NET with SQL Server backend) that will accept an application to receive funding, and this form has contact info fields for three people. Because the information is the same for all three people (name, email, phone), I chose to store contact info in a table separate from the applications. In the application table, there is an ID field for each of these three contacts that points to a contact in the contact table.

However, this presents an interesting question/problem in doing my CRUD stuff. The only ways I can think of to create, update, and retrieve information for both the application and all three contacts in one SQL transaction require very complicated stored procedures. In other words, it would be much simpler (for me) to retrieve this information via multiple transactions. However, since this information is never needed independently, I would ALWAYS be doing multiple transactions to get information for one application.

So my questions:

  1. Is this design overkill? There will never be more than three contact people per application, and there MUST be exactly three. Am I over-complicating by removing this information to a separate table?
  2. What is the real cost of doing several SQL transactions vs. writing my complex stored procedures and only needing one transaction?
  3. Generally speaking, what is the cost of a SQL transaction to a web application using ADO.NET?

Thanks for bearing with that long-winded explanation.

*EDIT*

After reading some of your responses, it appears I am using the term "transaction" wrong. What I am actually curious about is the cost of executing multiple queries across a single connection, versus doing one query. Sorry for the misunderstanding.

like image 242
Ender Avatar asked Aug 31 '25 22:08

Ender


2 Answers

Transactions are needed to move the database from one consistent state into another consistent state. The 'consistent' here applies to the application view of the database. The typical example is the money transfer between two accounts: you have to debit one account and credit another account. In between these two operations the database is inconsistent, there is some money that have 'vanished' (the sum debited from one account is nowhere). But at the end of the transaction, the database is again consistent. You need a transaction to span these two operations in order to protect the readers from viewing an inconsistent database, and in order to ensure a consistent database in case of a crash.

You say that in order to process multiple items in your logic in a single transaction you require complicated procedures. That is not true, the transaction scope is orthogonal to the request scope: a client can start a transaction, do 3 operations by invoking 3 procedures in 3 calls, then commit the transaction. It is not required that all the operations be done in one single stored procedure.

So transaction should not induce significant procedure overhead. In fact, a procedure would be oblivious to transactions. A well written procedure should behave correctly when it's invoked inside of a transaction as well as when it's invoked w/o a transaction. See Exception handling and nested transactions for a template for procedures that behave correctly in presence of transactions and exceptions.

An finally, what is the real cost of a transaction? Transaction write data (read aren't really needing transactions) and as such they lock the rows modified. Reads normally block when attempting to read these blocked rows, and then the longer a transaction is, the longer it locks rows, and the more reads it blocks. But there is a really simple solution: snapshot reads. Snapshot reads are really a magic pixie dust, they allow applications to go ahead unhindered by blocked rows, because a reader can always read the row version prior to the update that is blocking the row. See Using Row Versioning-based Isolation Levels.

So the conclusion is simple: transactions have no cost. Simply because there is no alternative to transactions. Is not an issue of 'X is slower that Y', is an issue of 'X is the only correct alternative'.

Updated

After your edit: the cost oh having multiple requests vs. one requests can be significant. A round-trip to the server (ie. issuing a request on an open connection) has a fixed cost. If you do multiple round-trips then you pay this fixed cost on each request. If you do one single request with multiple procedure invocations then this fixed cost is only payed once. On very hot systems, this difference is measurable and has a cost on overall performance. But I'm talking about really hot systems, as in thousands of requests per second. The solution is usually not to complicate the procedures, but issue multiple procedure calls in one single request:

SqlCommand cmd = new SqlCommand(@"
exec usp_firstProc @arg1, @arg2;
exec usp_secondProc @arg2, @arg3;
exec usp_thirdProc @arg1, @arg3;", connection);
cmd.Parameters.Add("@arg1", someValue);
...
cmd.ExecuteNonQuery();

But I must say that this solution is not really required unless you have a really hot workload. As a back-of-the-envelop rule, for anything under 1000 requests/sec, I would consider the benefits of clear code to outweigh the performance benefits.

One thing different though if you have to open a new connection for each request. The login handshake is really expensive, can be measured in hundreds of ms. But the solution is trivial: use connection pooling (which is enabled by default in ADO.Net) and don't discard connections in the app prematurely, keep the connection and reuse it until the entire unit of work is finished.

like image 140
Remus Rusanu Avatar answered Sep 03 '25 17:09

Remus Rusanu


  1. This is not overkill. This is a good normalization and a good application of the Don't-Repeat-Yourself principle. The requirements of the system may change and you may need 2 or 4 contacts in the future instead of 3.
  2. You don't need to do any complex stored procedures. You're not doing anything out of the ordinary here. You don't need stored procedures at all.
  3. Don't know.

When you say "transaction" I think you just mean "query." When I wanted to get the data for one application, I would use two separate, simple queries:

SELECT *
  FROM application

and

SELECT *
  FROM contact

It's two separate queries but big deal. The more important part is designing the database in a way that makes sense.

like image 31
Jason Swett Avatar answered Sep 03 '25 18:09

Jason Swett