I have a simple SELECT statement that pulls data from a SQL Server 2000 (so old) table with about 10-20 million rows, like this -
DECLARE @startDate DATETIME, @endDate DATETIME
SET @startDate = '2014-01-25' -- yyyy-mm-dd
SET @endDate = '2014-02-20'

SELECT
    t1.Id -- plus 6-7 other columns
FROM
    Table1 AS t1
LEFT OUTER JOIN
    Table2 AS t2 ON t1.Code = t2.Code
WHERE
    t1.Id = 'G59' -- yes, it's a varchar
    AND (t1.Entry_Date >= @startDate AND t1.Entry_Date < @endDate)
This gives me about 40K rows in about 10 seconds. But if I set @startDate = '2014-01-30', keeping @endDate the same as always, the query takes about 2 min 30 sec.
To confirm, I ran it with 01-30 again and it took 2 min 48 seconds.
I am surprised by the difference; I was not expecting it to be so big. If anything, I expected the smaller date range to take the same time or less.
What could be the reason for this, and how do I fix it?
Have you recently inserted and/or deleted a large number of rows? It could be that the statistics on the table's indexes are out of date, so the query optimizer goes for an "index seek + key lookup" plan on the smaller date range - but that turns out to be slower than just doing a table/clustered index scan.
I would recommend updating the statistics (see this TechNet article on how to update statistics) and trying again - any improvement?
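On SQL Server 2000, something along these lines should do it (a minimal sketch; Table1 is the table name from the question, and WITH FULLSCAN is optional - it reads every row instead of a sample, which is slower but gives the most accurate statistics):

-- Rebuild the statistics for the table the query filters on
UPDATE STATISTICS Table1 WITH FULLSCAN

-- Or refresh out-of-date statistics for every table in the database
EXEC sp_updatestats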
The query optimizer uses statistics to determine whether it's faster to just do a table scan (read all the table's data pages and select the rows that match), or to look up the search value in an index. That index typically doesn't contain all the columns the query needs, so once a match is found, a key lookup has to be performed against the table to fetch the rest of the data - an expensive operation that is only viable for small sets of rows. If out-of-date statistics "mislead" the query optimizer about how many rows a date range will match, it might choose a suboptimal execution plan.
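You can check which plan the optimizer picks for each date range with SHOWPLAN_TEXT, which is available on SQL Server 2000 (a sketch using the query from the question; while SHOWPLAN_TEXT is on, the statement is not executed - the estimated plan is returned instead):

SET SHOWPLAN_TEXT ON
GO
-- Look for 'Index Seek' + 'Bookmark Lookup' (SQL Server 2000's name for a key lookup)
-- in the slow case vs. 'Clustered Index Scan' in the fast one
SELECT t1.Id
FROM Table1 AS t1
LEFT OUTER JOIN Table2 AS t2 ON t1.Code = t2.Code
WHERE t1.Id = 'G59'
  AND t1.Entry_Date >= '2014-01-30' AND t1.Entry_Date < '2014-02-20'
GO
SET SHOWPLAN_TEXT OFF
GO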