
When are TSQL Cursors the best or only option?

I've been having an argument recently about using cursors in TSQL...

First of all, I'm not a cheerleader in the debate. But every time someone says "cursor", there's always some knucklehead (or 50) who pounces with the obligatory 'cursors are evil' mantra. I know SQL Server was optimized for set-based operations, and maybe cursors truly ARE evil incarnate, but if I wanted to put some objective thought behind that...

Here's where my mind is going:

  1. Is the only difference between cursors and set operations one of performance?

    Edit: There's been a good case made that it isn't simply a matter of performance -- for example, running a single batch over and over for a list of IDs, or executing actual SQL text stored in a table field, row by row.

  2. Follow-up: do cursors always perform worse?

    • EDIT: @Martin shows a good case where cursors out-perform set-based operations fairly dramatically. I suspect this isn't the kind of thing you'd do too often (before you resorted to some kind of OLAP / data warehouse solution), but nonetheless, it seems like a case where you really couldn't live without a cursor.
    • Reference to TPC benchmarks suggesting cursors may be more competitive than folks generally believe.
    • Reference to memory-usage optimizations for cursors since SQL Server 2005.
  3. Are there any problems you can think of, that cursors are better suited to solve than set-based operations?

    • EDIT: Set-based operations literally cannot execute stored procedures, etc. (see edit for item 1 above).
    • EDIT: Set-based operations can be far slower than row-by-row processing for some aggregations over large data sets (see the edit for item 2 above).
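To make the "SQL text stored in a table field" case concrete, here is a minimal sketch of the row-by-row pattern. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, so it shows the shape of the loop rather than T-SQL cursor syntax. The table names (`maintenance_jobs`, `counters`) and the statements stored in them are hypothetical, made up for illustration.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a hypothetical table of stored SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE counters (n INTEGER)")
    conn.execute("INSERT INTO counters (n) VALUES (0)")
    conn.execute(
        "CREATE TABLE maintenance_jobs (id INTEGER PRIMARY KEY, sql_text TEXT)"
    )
    conn.executemany(
        "INSERT INTO maintenance_jobs (id, sql_text) VALUES (?, ?)",
        [
            (1, "UPDATE counters SET n = n + 5"),
            (2, "UPDATE counters SET n = n + 1"),
        ],
    )
    conn.commit()
    return conn


def run_stored_statements(conn):
    """Execute each stored statement in id order; return the ids that ran."""
    # Fetch every job up front so running them cannot disturb the read.
    jobs = conn.execute(
        "SELECT id, sql_text FROM maintenance_jobs ORDER BY id"
    ).fetchall()
    executed = []
    for job_id, sql_text in jobs:
        # Each row carries a different statement, so there is no single
        # set-based query to write; the work is inherently one row at a time.
        conn.execute(sql_text)
        executed.append(job_id)
    conn.commit()
    return executed
```

Running arbitrary SQL read from a table is only reasonable when the contents of that table are trusted.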

  • Article from MSDN explaining their perspective of the most common problems people resort to cursors for (and some explanation of set-based techniques that would work better.)
  • Microsoft says (vaguely) in the 2008 Transact-SQL Reference on MSDN: "...there are times when the results are best processed one row at a time", but they don't give any examples of the cases they're referring to.
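One plausible reading of the aggregation case in items 2 and 3 is a running total; that is an assumption, since the post doesn't name the exact problem. The sketch below compares the two shapes using Python's built-in sqlite3 module as a stand-in for SQL Server. The `sales` table and its data are hypothetical. The row-by-row version touches each row once, while the self-join version re-sums all earlier rows for every row, so its work grows roughly quadratically with table size.

```python
import sqlite3


def make_db(n=5):
    """Build an in-memory table of n hypothetical sales rows (amount = id * 10)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales (id, amount) VALUES (?, ?)",
        [(i, i * 10) for i in range(1, n + 1)],
    )
    conn.commit()
    return conn


def running_total_row_by_row(conn):
    """Cursor-style: walk the rows in order, carrying the total along."""
    total = 0
    result = []
    for row_id, amount in conn.execute(
        "SELECT id, amount FROM sales ORDER BY id"
    ):
        total += amount
        result.append((row_id, total))
    return result


def running_total_set_based(conn):
    """Set-based: join each row to itself and every earlier row, then sum."""
    sql = """
        SELECT a.id, SUM(b.amount)
        FROM sales AS a
        JOIN sales AS b ON b.id <= a.id
        GROUP BY a.id
        ORDER BY a.id
    """
    return conn.execute(sql).fetchall()
```

Both functions return the same rows; only the amount of work differs. SQL Server 2012 and later also support ordered window aggregates (`SUM(...) OVER (ORDER BY ...)`), which give a set-based running total without the self-join.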

Mostly, I'm of a mind to convert cursors to set-based operations in my old code if/as I do any significant upgrades to various applications, as long as there's something to be gained from it. (I tend toward laziness over purity a lot of the time -- i.e., if it ain't broke, don't fix it.)

Chains asked Dec 22 '22 10:12


1 Answer

To answer your question directly:

I have yet to encounter a situation where set operations could not do what might otherwise be done with cursors. However, there are situations where using cursors to break a large set problem down into more manageable chunks proves a better solution for purposes of code maintainability, logging, transaction control, and the like. But I doubt there are any hard-and-fast rules to tell you what types of requirements would lead to one solution or the other -- individual databases and needs are simply far too varied.
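As a rough illustration of the "manageable chunks" idea, here is a minimal sketch that deletes old rows in small batches, committing one transaction per batch so each step can be logged and no single transaction grows large. It uses Python's built-in sqlite3 module as a stand-in for SQL Server; the `events` table, the cutoff, and the chunk size are all hypothetical.

```python
import sqlite3


def make_events(n=10):
    """Build an in-memory table with n hypothetical event rows (ids 1..n)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO events (id, payload) VALUES (?, ?)",
        [(i, "event-%d" % i) for i in range(1, n + 1)],
    )
    conn.commit()
    return conn


def purge_in_chunks(conn, cutoff, chunk_size=2):
    """Delete rows with id < cutoff, chunk_size rows per transaction.

    Returns the number of rows removed by each batch, in order.
    """
    batches = []
    while True:
        # "with conn" commits if the block succeeds and rolls back on error,
        # so a failure only loses the current batch.
        with conn:
            cur = conn.execute(
                "DELETE FROM events WHERE id IN ("
                "  SELECT id FROM events WHERE id < ? ORDER BY id LIMIT ?"
                ")",
                (cutoff, chunk_size),
            )
        if cur.rowcount == 0:
            break  # nothing left below the cutoff
        batches.append(cur.rowcount)  # a natural place to log progress
    return batches
```

Each batch is still a set-based statement; the loop only controls how much work goes into one transaction. That is one way the two approaches can be combined rather than treated as opposites.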

That said, I fully concur with your "if it ain't broke, don't fix it" approach. There is little to be gained by refactoring procedural code to set operations for a procedure that is working just fine. However, it is a good rule of thumb to look first for a set-based solution and only drop into procedural code when you must. Gut feel? If you're using cursors more than 20% of the time, you're doing something wrong.

And for what I really want to say:

When I interview programmers, I always throw them a couple of moderately complex SQL questions and ask them to explain how they'd solve them. These are problems that I know can be solved with set operations, and I'm specifically looking for candidates who are able to solve them without procedural approaches (i.e., cursors).

This is not because I believe there is anything inherently good or more performant in either approach -- different situations yield different results. Rather it's because, in my experience, programmers either get the concept of set-based operations or they do not. If they do not, they will spend too much time developing complex procedural solutions for problems that can be solved far more quickly and simply with set-based operations.

Conversely, a programmer who gets set-based operations almost never has problems implementing a procedural solution when, indeed, it's absolutely necessary.

Michael Ames answered Mar 03 '23 06:03