
SQL Snapshot Isolation limits

For my database application, using snapshot isolation for some of its transactions seems like a perfect fit for one of the critical requirements.

I'm worried, though, that choosing snapshot isolation (which I believe must be enabled database-wide) now will bite us once we start handling very high volumes. What is the cost of snapshot isolation? Is it a fixed cost, linear, or geometric?
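For reference, my understanding is that in SQL Server this is a database-level switch; a minimal sketch of turning it on (MyDatabase, dbo.Accounts, and the column names are placeholders):

    -- Allow transactions to request SNAPSHOT isolation explicitly
    ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

    -- Optionally make plain READ COMMITTED use row versioning as well
    ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON;

    -- An individual transaction then opts in like this:
    SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
    BEGIN TRANSACTION;
    SELECT Balance FROM dbo.Accounts WHERE AccountId = 1;
    COMMIT;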

If I'm right to be concerned about high volumes, are there strategies/patterns for application-level functionality similar to snapshot isolation that might have better overall performance, but take more time/expertise to implement?

Thanks,

Jason

asked Dec 31 '09 by Jason Kleban


2 Answers

For anyone who's not already an expert on locking and database implementations, this can be a surprisingly difficult subject to wrap your mind around.

I highly recommend reading this series of posts on snapshot isolation by Hugo Kornelis (SQL Server MVP). So far it's the most complete analysis I've seen of practical considerations when using snapshots.

To summarize the main issues:

  • When a particular combination of concurrent transactions would make it possible to violate a constraint (UNIQUE, FOREIGN KEY, etc.), SQL Server will fall back to the old way of doing things. This is good for reliability, obviously, but not for performance. Snapshots aren't a panacea; they aren't a replacement for good database/query design and intelligent lock management.
  • Snapshots and triggers may not play nice together. It's especially dangerous if you use triggers to protect data integrity, but even if you don't, pretty much all of your triggers will have to be made snapshot-aware.

Depending on how you write your queries, you may not even need to be using triggers in order to end up with unexpected or inconsistent results.
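To make that concrete, here's a minimal sketch of the classic write-skew anomaly under SNAPSHOT isolation; the dbo.Doctors table and the "at least one doctor on call" rule are invented for illustration:

    -- Setup (hypothetical): dbo.Doctors(DoctorId, OnCall) with two rows, both OnCall = 1.
    -- Business rule: at least one doctor must remain on call.

    -- Session 1
    SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
    BEGIN TRAN;
    SELECT COUNT(*) FROM dbo.Doctors WHERE OnCall = 1;    -- sees 2 in its snapshot
    UPDATE dbo.Doctors SET OnCall = 0 WHERE DoctorId = 1;

    -- Session 2 (interleaved, before session 1 commits)
    SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
    BEGIN TRAN;
    SELECT COUNT(*) FROM dbo.Doctors WHERE OnCall = 1;    -- also sees 2 in its snapshot
    UPDATE dbo.Doctors SET OnCall = 0 WHERE DoctorId = 2;

    -- Both commits succeed: different rows were updated, so there is no
    -- update conflict, yet afterwards nobody is on call.
    COMMIT;  -- session 1
    COMMIT;  -- session 2

Under locking-based SERIALIZABLE, the range locks taken by those SELECTs would block one of the UPDATEs (or force a deadlock and rollback) rather than letting the rule break silently.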

I don't know whether the costs are fixed or linear, though they're definitely not geometric; I do know that it's a bit of a headache regardless. It's often talked about as a fire-and-forget option, but the truth is it's not: if you don't know what you're doing, you can end up with breaking changes (ones you probably won't find out about until it's too late!).

By all means do use it if you're sure that it won't cause any other problems. But if any of your logic doesn't care about dirty reads (and this applies to more than half the SELECT queries in many systems), you'll get far better results with READ UNCOMMITTED (which does involve more "expertise" - you have to think very carefully about what can happen and when).
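For comparison, a minimal sketch of opting into READ UNCOMMITTED (dbo.Orders and its columns are invented for illustration):

    -- Session-level: subsequent statements in this session read uncommitted data
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
    SELECT COUNT(*) FROM dbo.Orders WHERE Status = 'Open';

    -- Table-level: the NOLOCK hint has the same effect for just this reference
    SELECT COUNT(*) FROM dbo.Orders WITH (NOLOCK) WHERE Status = 'Open';

Either way you're accepting the occasional dirty, missing, or double-counted row, which is exactly the "think very carefully about what can happen and when" part.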

Update: On application-level alternatives

The only one that springs to mind is caching. Some data frameworks can do this for you (NHibernate, EF), and in some cases you might even have a third tier of caching, such as web services that cache results keyed on the input message, where the results may be drawn from several queries. I wouldn't really call this an "alternative", but I imagine that some form of caching would be effective in your case if these are read-only queries and the underlying data doesn't change frequently. The design consideration is, of course, how much data you can afford to cache relative to the amount you need to serve; if the system is massively concurrent, this might not scale.

Beyond that, I personally would not choose to try to implement my own app-level transactional "tier". Maybe some people have done this, but I don't think there's any way that my limited experience can compete with hundreds or thousands of the brightest designers working on a DBMS for 20 years.

answered by Aaronaught


Snapshot isolation is meant to be more read-performant than other isolation levels. Because readers work from a versioned snapshot of the data, the transaction does not need to acquire shared locks on the rows it reads, which prevents blocking and reader/writer deadlocks.
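As a rough illustration (dbo.Accounts is an invented table), a SNAPSHOT reader gets the last committed version instead of waiting on a writer's exclusive lock:

    -- Session 1: modifies a row and holds its exclusive lock, not yet committed
    BEGIN TRAN;
    UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1;

    -- Session 2: under SNAPSHOT this returns the last committed Balance right away;
    -- under the default locking READ COMMITTED it would block until session 1
    -- commits or rolls back.
    SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
    BEGIN TRAN;
    SELECT Balance FROM dbo.Accounts WHERE AccountId = 1;
    COMMIT;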

However, it does have to write row versioning information into the tempdb database, so each transaction that modifies data incurs some extra write overhead.
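If you want to keep an eye on that overhead, one way (assuming you can query tempdb's space-usage DMV) is to check how much space the version store has reserved:

    -- Pages reserved by the row version store in tempdb (pages are 8 KB)
    SELECT SUM(version_store_reserved_page_count) * 8 AS version_store_kb
    FROM tempdb.sys.dm_db_file_space_usage;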

As with everything else, your circumstances will dictate whether this is more or less performant for you. If your application is OLTP-style and your transactions are prone to deadlocking, it could be a large increase in performance.

answered by womp