Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimizing ROW_NUMBER() in SQL Server

We have a number of machines which record data into a database at sporadic intervals. For each record, I'd like to obtain the time period between this recording and the previous recording.

I can do this using ROW_NUMBER as follows:

WITH TempTable AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Machine_ID ORDER BY Date_Time) AS Ordering
    FROM dbo.DataTable
)

SELECT [Current].*, Previous.Date_Time AS PreviousDateTime
FROM TempTable AS [Current]
INNER JOIN TempTable AS Previous 
    ON [Current].Machine_ID = Previous.Machine_ID
    AND Previous.Ordering = [Current].Ordering + 1

The problem is, it goes really slow (several minutes on a table with about 10k entries) - I tried creating separate indicies on Machine_ID and Date_Time, and a single joined-index, but nothing helps.

Is there anyway to rewrite this query to go faster?

like image 713
BlueRaja - Danny Pflughoeft Avatar asked Jun 02 '10 17:06

BlueRaja - Danny Pflughoeft


4 Answers

The given ROW_NUMBER() partition and order require an index on (Machine_ID, Date_Time) to satisfy in one pass:

CREATE INDEX idxMachineIDDateTime ON DataTable (Machine_ID, Date_Time);

Separate indexes on Machine_ID and Date_Time will help little, if any.

like image 131
Remus Rusanu Avatar answered Sep 26 '22 16:09

Remus Rusanu


How does it compare to this version?:

SELECT x.*
    ,(SELECT MAX(Date_Time)
      FROM dbo.DataTable
      WHERE Machine_ID = x.Machine_ID
          AND Date_Time < x.Date_Time
    ) AS PreviousDateTime
FROM dbo.DataTable AS x

Or this version?:

SELECT x.*
    ,triang_join.PreviousDateTime
FROM dbo.DataTable AS x
INNER JOIN (
    SELECT l.Machine_ID, l.Date_Time, MAX(r.Date_Time) AS PreviousDateTime
    FROM dbo.DataTable AS l
    LEFT JOIN dbo.DataTable AS r
    ON l.Machine_ID = r.Machine_ID
        AND l.Date_Time > r.Date_Time
    GROUP BY l.Machine_ID, l.Date_Time
) AS triang_join
ON triang_join.Machine_ID = x.Machine_ID
    AND triang_join.Date_Time = x.Date_Time

Both would perform best with an index on Machine_ID, Date_Time and for correct results, I'm assuming that this is unique.

You haven't mentioned what is hidden away in * and that can sometimes means a lot since a Machine_ID, Date_Time index will not generally be covering and if you have a lot of columns there or they have a lot of data, ...

like image 36
Cade Roux Avatar answered Sep 22 '22 16:09

Cade Roux


If the number of rows in dbo.DataTable is large then it is likely that you are experiencing the issue due to the CTE self joining onto itself. There is a blog post explaining the issue in some detail here

Occasionally in such cases I have resorted to creating a temporary table to insert the result of the CTE query into and then doing the joins against that temporary table (although this has usually been for cases where a large number of joins against the temp table are required - in the case of a single join the performance difference will be less noticable)

like image 28
Paul McLoughlin Avatar answered Sep 22 '22 16:09

Paul McLoughlin


I have had some strange performance problems using CTEs in SQL Server 2005. In many cases, replacing the CTE with a real temp table solved the problem.

I would try this before going any further with using a CTE.

I never found any explanation for the performance problems I've seen, and really didn't have any time to dig into the root causes. However I always suspected that the engine couldn't optimize the CTE in the same way that it can optimize a temp table (which can be indexed if more optimization is needed).

Update

After your comment that this is a view, I would first test the query with a temp table to see if that performs better.

If it does, and using a stored proc is not an option, you might consider making the current CTE into an indexed/materialized view. You will want to read up on the subject before going down this road, as whether this is a good idea depends on a lot of factors, not the least of which is how often the data is updated.

like image 24
Phil Sandler Avatar answered Sep 24 '22 16:09

Phil Sandler