I am wondering. I have a complex query which runs in a SQL Server 2005 Express edition in around 3 seconds.
The main table has around 300k rows.
When I add
ROW_NUMBER() OVER (ORDER BY date_column)
it takes 123 seconds while date_column
is a datetime
column.
If I do
ROW_NUMBER() OVER (ORDER BY string_title)
it runs in 3 seconds again.
I added an index on the datetime
column. No change. Still 123 seconds.
Then I tried:
ROW_NUMBER() OVER (ORDER BY CAST(date_column AS int))
and the query runs in 3 seconds again.
Since casting needs time, why does SQL Server behave like this???
UPDATE: It seems like ROW_NUMBER ignore my WHERE statements at all and build a row column list for all available entries? Can anyone confirm that ?
Here I copied a better read able (still tonz of logic :)) in the SQL Management Studio:
SELECT ROW_NUMBER() OVER (ORDER BY xinfobase.lid) AS row_num, *
FROM xinfobase
LEFT OUTER JOIN [xinfobasetree] ON [xinfobasetree].[lid] = [xinfobase].[xlngfolder]
LEFT OUTER JOIN [xapptqadr] ON [xapptqadr].[lid] = [xinfobase].[xlngcontact]
LEFT OUTER JOIN [xinfobasepvaluesdyn] ON [xinfobasepvaluesdyn].[lparentid] = [xinfobase].[lid]
WHERE (xinfobase.xlngisdeleted=2
AND xinfobase.xlinvalid=2)
AND (xinfobase.xlngcurrent=1)
AND ( (xinfobase.lownerid = 1
OR (SELECT COUNT(lid)
FROM xinfobaseacl
WHERE xinfobaseacl.lparentid = xinfobase.lid
AND xlactor IN(1,-3,-4,-230,-243,-254,-255,-256,-257,-268,-589,-5,-6,-7,-8,-675,-676,-677,-9,-10,-864,-661,-671,-913))>0
OR xinfobasetree.xlresponsible = 1)
AND (xinfobase.lid IN (SELECT lparentid
FROM xinfobasealt a, xinfobasetree t
WHERE a.xlfolder IN(1369)
AND a.xlfolder = t.lid
AND dbo.sf_MatchRights(1, t.xtxtrights,'|')=1 )) )
AND ((SELECT COUNT(*) FROM dbo.fn_Split(cf_17,',')
WHERE [value] = 39)>0)
This query need 2-3 seconds on 300k records.
Now I changed the ORDER BY
to xinfobase.xstrtitle
then it runs in around 2-3 seconds again.
If I switch to xinfobase.dtedit
(datetime column with an additional index I just added) it needs hte time I mentioned above already.
I also tried to "cheat" and made my statement as a SUB SELECT to force him to retriev the records first and do a ROW_NUMBER()
outside in another SQL statement, same performance result.
The ORDER BY clause determines the sequence in which the rows are assigned their unique ROW_NUMBER within a specified partition. It is required.
The Row_Number function is used to provide consecutive numbering of the rows in the result by the order selected in the OVER clause for each partition specified in the OVER clause. It will assign the value 1 for the first row and increase the number of the subsequent rows. OVER - Specify the order of the rows.
In my experience, an aggregate (DISTINCT or GROUP BY) can be quicker then a ROW_NUMBER() approach.
The ROW_NUMBER function cannot currently be used in a WHERE clause. Derby does not currently support ORDER BY in subqueries, so there is currently no way to guarantee the order of rows in the SELECT subquery.
UPDATE
After I was still frustrated about doing a workaround I was investigating more. I removed all my existing indexes and run several SQL statements against the tables. It turns out, that building new indexes with a new sortorder of columns and include different columns I fixed my issue and the query is fast with dtedit (datetime) column as well.
So lessons learned: Take more care of your indexes and execution plans and recheck them with every update (new version) of the software you produce...
But still wonderung why CAST(datetime_column AS int) makes it fast before...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With