Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Ways to avoid eager spool operations on SQL Server

People also ask

What is eager spool in SQL Server?

The role of the Eager Spool is to catch all the rows received from another operator and store these rows in TempDB. The word “eager” means that the operator will read ALL rows from the previously operator at one time. In other words, it will take the entire input, storing each row received.

What is lazy spool in SQL Server?

SQL Server Table Spool Operator (Lazy Spool) Te SQL Server Lazy Spool is used to build a temporary table on the TempDB and fill it in lazy manner. In other words, it fills the table by reading and storing the data only when individual rows are required by the parent operator.

Where can I use option recompile?

WITH RECOMPILE Option If this option is used when the procedure definition is created, it requires CREATE PROCEDURE permission in the database and ALTER permission on the schema in which the procedure is being created. If this option is used in an EXECUTE statement, it requires EXECUTE permissions on the procedure.


My understanding of spooling is that it's a bit of a red herring on your execution plan. Yes, it accounts for a lot of your query cost, but it's actually an optimization that SQL Server undertakes automatically so that it can avoid costly rescanning. If you were to avoid spooling, the cost of the execution tree it sits on will go up and almost certainly the cost of the whole query would increase. I don't have any particular insight into what in particular might cause the database's query optimizer to parse the execution that way, especially without seeing the SQL code, but you're probably better off trusting its behavior.

However, that doesn't mean your execution plan can't be optimized, depending on exactly what you're up to and how volatile your source data is. When you're doing a SELECT INTO, you'll often see spooling items on your execution plan, and it can be related to read isolation. If it's appropriate for your particular situation, you might try just lowering the transaction isolation level to something less costly, and/or using the NOLOCK hint. I've found in complicated performance-critical queries that NOLOCK, if safe and appropriate for your data, can vastly increase the speed of query execution even when there doesn't seem to be any reason it should.

In this situation, if you try READ UNCOMMITTED or the NOLOCK hint, you may be able to eliminate some of the Spools. (Obviously you don't want to do this if it's likely to land you in an inconsistent state, but everyone's data isolation requirements are different). The TOP operator and the OR operator can occasionally cause spooling, but I doubt you're doing any of those in an ETL process...

You're right in saying that your UDFs could also be the culprit. If you're only using each UDF once, it would be an interesting experiment to try putting them inline to see if you get a large performance benefit. (And if you can't figure out a way to write them inline with the query, that's probably why they might be causing spooling).

One last thing I would look at is that, if you're doing any joins that can be re-ordered, try using a hint to force the join order to happen in what you know to be the most selective order. That's a bit of a reach but it doesn't hurt to try it if you're already stuck optimizing.