This requires a bit of background. I'm creating a web app (ASP.NET/VB.NET with a SQL Server backend) that accepts applications for funding, and the form has contact info fields for three people. Because the same information (name, email, phone) is collected for all three people, I chose to store contact info in a table separate from the applications. The application table has an ID field for each of these three contacts that points to a row in the contact table.
However, this presents an interesting question/problem in doing my CRUD stuff. The only ways I can think of to create, update, and retrieve information for both the application and all three contacts in one SQL transaction require very complicated stored procedures. In other words, it would be much simpler (for me) to retrieve this information via multiple transactions. However, since this information is never needed independently, I would ALWAYS be doing multiple transactions to get information for one application.
So my question: is there a significant cost to retrieving this information with multiple transactions, when I know I will always need all of it for one application, versus doing everything in one?
Thanks for bearing with that long-winded explanation.
*EDIT*
After reading some of your responses, it appears I am using the term "transaction" wrong. What I am actually curious about is the cost of executing multiple queries across a single connection, versus doing one query. Sorry for the misunderstanding.
Transactions are needed to move the database from one consistent state to another consistent state. 'Consistent' here applies to the application's view of the database. The typical example is a money transfer between two accounts: you have to debit one account and credit the other. In between these two operations the database is inconsistent; some money has 'vanished' (the sum debited from one account is nowhere). But at the end of the transaction the database is consistent again. You need a transaction to span these two operations in order to protect readers from viewing an inconsistent database, and to ensure the database ends up consistent in case of a crash.
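Something along these lines (a sketch only; the accounts table, its columns, and the variables are made up for illustration, and connection is an already-open SqlConnection):
// Sketch: hypothetical table/column names. Both updates commit or roll
// back together, so a reader never sees money that has 'vanished'.
var cmd = new SqlCommand(@"
    SET XACT_ABORT ON;  -- any error aborts and rolls back the whole batch
    BEGIN TRANSACTION;
        UPDATE accounts SET balance = balance - @amount WHERE account_id = @from;
        UPDATE accounts SET balance = balance + @amount WHERE account_id = @to;
    COMMIT TRANSACTION;", connection);
cmd.Parameters.AddWithValue("@amount", 100m);
cmd.Parameters.AddWithValue("@from", fromAccountId);
cmd.Parameters.AddWithValue("@to", toAccountId);
cmd.ExecuteNonQuery();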
You say that processing multiple items in your logic in a single transaction requires complicated procedures. That is not true: the transaction scope is orthogonal to the request scope. A client can start a transaction, do 3 operations by invoking 3 procedures in 3 calls, then commit the transaction. It is not required that all the operations be done in one single stored procedure.
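As a sketch (the procedure names and parameters here are placeholders, not your actual schema), the client owns the transaction and each simple procedure call just enlists in it:
// Sketch: the client starts the transaction, makes three separate calls,
// then commits. No single complicated procedure is required.
using (SqlTransaction tran = connection.BeginTransaction())
{
    var saveApp = new SqlCommand("usp_saveApplication", connection, tran);
    saveApp.CommandType = CommandType.StoredProcedure;
    saveApp.Parameters.AddWithValue("@name", applicationName);
    saveApp.ExecuteNonQuery();

    var saveContact = new SqlCommand("usp_saveContact", connection, tran);
    saveContact.CommandType = CommandType.StoredProcedure;
    saveContact.Parameters.AddWithValue("@email", contactEmail);
    saveContact.ExecuteNonQuery();   // repeat for the other contacts

    tran.Commit();   // all the calls succeed together, or nothing is written
}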
So transactions should not induce significant procedure overhead. In fact, a procedure should be oblivious to transactions: a well-written procedure behaves correctly when it's invoked inside a transaction as well as when it's invoked without one. See Exception handling and nested transactions for a template for procedures that behave correctly in the presence of transactions and exceptions.
And finally, what is the real cost of a transaction? Transactions write data (reads don't really need transactions), and as such they lock the rows they modify. Readers normally block when attempting to read these locked rows, so the longer a transaction is open, the longer it holds its locks and the more reads it blocks. But there is a really simple solution: snapshot reads. Snapshot reads are magic pixie dust: they allow readers to go ahead unhindered by locked rows, because a reader can always read the row version prior to the update that is locking the row. See Using Row Versioning-based Isolation Levels.
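A hedged sketch of what a snapshot read looks like from ADO.Net, assuming the database has snapshot isolation enabled (ALTER DATABASE ... SET ALLOW_SNAPSHOT_ISOLATION ON) and that the application table's key column is simply named id:
// Sketch only: requires ALLOW_SNAPSHOT_ISOLATION ON at the database level.
// The reader sees the last committed version of each row instead of
// blocking on rows locked by an in-flight writer.
using (SqlTransaction tran = connection.BeginTransaction(IsolationLevel.Snapshot))
{
    var cmd = new SqlCommand("SELECT * FROM application WHERE id = @id", connection, tran);
    cmd.Parameters.AddWithValue("@id", applicationId);
    using (SqlDataReader reader = cmd.ExecuteReader())
    {
        while (reader.Read()) { /* read without waiting on writers */ }
    }
    tran.Commit();
}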
So the conclusion is simple: transactions have no cost, simply because there is no alternative to transactions. It's not an issue of 'X is slower than Y'; it's an issue of 'X is the only correct alternative'.
Updated
After your edit: the cost of having multiple requests vs. one request can be significant. A round-trip to the server (i.e. issuing a request on an open connection) has a fixed cost. If you do multiple round-trips, you pay this fixed cost on each request. If you do one single request with multiple procedure invocations, this fixed cost is paid only once. On very hot systems this difference is measurable and has a cost on overall performance. But I'm talking about really hot systems, as in thousands of requests per second. The solution is usually not to complicate the procedures, but to issue multiple procedure calls in one single request:
SqlCommand cmd = new SqlCommand(@"
exec usp_firstProc @arg1, @arg2;
exec usp_secondProc @arg2, @arg3;
exec usp_thirdProc @arg1, @arg3;", connection);
cmd.Parameters.Add("@arg1", someValue);
...
cmd.ExecuteNonQuery();
But I must say that this solution is not really required unless you have a really hot workload. As a back-of-the-envelope rule, for anything under 1000 requests/sec I would consider the benefits of clear code to outweigh the performance benefits.
One thing is different, though, if you have to open a new connection for each request. The login handshake is really expensive; it can be measured in hundreds of milliseconds. But the solution is trivial: use connection pooling (which is enabled by default in ADO.Net) and don't discard connections prematurely in the app; keep the connection and reuse it until the entire unit of work is finished.
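A small sketch of that pattern (connectionString is whatever the app already uses; pooling is on by default, so no extra configuration is assumed):
// Close/Dispose returns the physical connection to the pool rather than
// closing it; the next Open with the same connection string reuses it,
// so the expensive login handshake is paid only once.
using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    // ... run every query for this unit of work on this one connection ...
}   // returned to the pool here, not physically closed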
When you say "transaction" I think you just mean "query." When I wanted to get the data for one application, I would use two separate, simple queries:
SELECT *
FROM application
and
SELECT *
FROM contact
It's two separate queries, but that's no big deal. The more important part is designing the database in a way that makes sense.
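For the OP's case, that might look something like the sketch below (the table and column names are assumptions based on the question's description of an application row holding three contact IDs). Both statements go out in one batch on one connection and come back as two result sets:
// Sketch only: schema names are assumed. One round-trip, two result sets:
// the application row, then its three contacts.
var cmd = new SqlCommand(@"
    SELECT * FROM application WHERE id = @id;
    SELECT c.*
    FROM contact c
    JOIN application a ON c.id IN (a.contact1_id, a.contact2_id, a.contact3_id)
    WHERE a.id = @id;", connection);
cmd.Parameters.AddWithValue("@id", applicationId);
using (SqlDataReader reader = cmd.ExecuteReader())
{
    while (reader.Read()) { /* the application row */ }
    reader.NextResult();
    while (reader.Read()) { /* the contact rows */ }
}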