I have some legacy SQL (SP)
declare @FactorCollectionId int; select @FactorCollectionId = collectionID from dbo.collection where name = 'Factor'
declare @changeDate datetime; set @changeDate = getDate()
declare @changeTimeID int; set @changeTImeID = convert(int, convert(varchar(8), @changeDate, 112))
declare @MaxWindowID int; select @MaxWindowID = MAX(windowID) from dbo.window
select distinct @FactorCollectionId, ElementId, T.TimeID, @changeTimeId ChangeTimeID, 1 UserID, @MaxWindowID, 0 ChangeID
, null TransactionID, SystemSourceID, changeTypeID, 'R' OlapStatus, Comment, Net0 Delta0, Net0
, 1 CreatedBy, 1 UpdatedBy, @changeDate CreatedDate, @changeDate UpdatedDate, 1 CurrentRecord, MeasureTypeID
from dbo.aowCollectedFact FV
inner join dbo.timeView T on T.timeID >= FV.timeID
where FV.currentRecord = 1 --is current record
and T.CurrentHorizon <> 0 --Indicator that Time is part of current horizon
and FV.collectionID = @FactorCollectionId --factor collections only
and FV.timeID = (select MAX(timeID) --latest collected fact timeID for given collectionID and elementID
from aowCollectedFact FV2
where FV2.collectionId = @FactorCollectionId
and FV2.elementId = FV.elementID)
and (((T.ForecastLevel = 'Month') and (T.FirstDayInMonth = T.Date)) --Date is first of month for monthly customers, or
or
((T.ForecastLevel = 'Quarter')and (T.FirstDayInQuarter = T.Date))) --Date is first of quarter for quarterly customers
and not exists (select 1 --Record does not already exist in collected fact view
from aowCollectedFact FV3 -- for this factor collection, elementID, and timeID
where FV3.collectionId = @FactorCollectionId
and FV3.elementID = FV.elementId
and FV3.timeID = T.timeID)
This SQL processes over 2 million rows. I need to improve its performance. When I look at the execution plan I find that a lot of time is spent on a Table Spool (Lazy spool)
operation (indexes exist in tables and they work well).
How to improve performance for this part ?
One way we can reduce the join impact is reduce the amount of data that is being joined. First we will look at the left outer join towards the end of the query. The cost is 37% and it is joining data from the primary key index and data that has previously been joined and worked with.
SQL Server Table Spool Operator (Lazy Spool) Te SQL Server Lazy Spool is used to build a temporary table on the TempDB and fill it in lazy manner. In other words, it fills the table by reading and storing the data only when individual rows are required by the parent operator.
A spool is basically a temporary table created within the execution of the query that is used when it's likely that data will be needed again, and again during the execution of the query.
The role of the Eager Spool is to catch all the rows received from another operator and store these rows in TempDB. The word “eager” means that the operator will read ALL rows from the previously operator at one time. In other words, it will take the entire input, storing each row received.
Before seeing the execution plan or table indices, I'll give best educated guesses. First, here are a couple links worth reading.
showplan operator of the week - lazy spool
Table spool/Lazy spool
INDEXING: Take a look at your indices to make sure that they're all covering the columns that you're selecting out of the tables. You'll want to aim to get all the columns included in JOINs and WHERE clauses within the indices. All other columns that are in the SELECT statements should be INCLUDEd, or covered, by the index.
OPERATORS: See if you can get rid of the not equals ("<>") operators, in favor of a single greater than or less than operator. Can this statement and T.CurrentHorizon <> 0
be changed to this and T.CurrentHorizon > 0
?
JOINS: Get rid of the subqueries that are JOINing to tables outside of themselves. For instance, this line and FV2.elementId = FV.elementID
might be causing some problems. There's no reason you can't move that out of a subquery and into a JOIN to dbo.aowCollectedFact FV
, given that you're GROUPing (DISTINCT) in the main query already.
DISTINCT: Change it to a GROUP BY. I've got no reason other than, because it's good practice and takes two minutes.
LAST NOTE: The exception to all the above might be to leave the final subquery, the IF NOT EXISTS
, as a subquery. If you change it to a JOIN, it'll have to be a LEFT JOIN...WHERE NULL
statement, which can actually cause spooling operations. No great way to get around that one.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With