Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Selecting SUM of TOP 2 values within a table with multiple GROUP in SQL

I've been playing with sets in SQL Server 2000 and have the following table structure for one of my temp tables (#Periods):

    RestCTR     HoursCTR    Duration    Rest
    ----------------------------------------
    1           337         2           0
    2           337         46          1
    3           337         2           0
    4           337         46          1
    5           338         1           0
    6           338         46          1
    7           338         2           0
    8           338         46          1
    9           338         1           0
    10          339         46          1
    ...

What I'd like to do is to calculate the Sum of the 2 longest Rest periods for each HoursCTR, preferably using sets and temp tables (rather than cursors, or nested subqueries).

Here's the dream query that just won't work in SQL (no matter how many times I run it):

Select HoursCTR, SUM ( TOP 2 Duration ) as LongestBreaks
FROM #Periods
WHERE Rest = 1
Group By HoursCTR    

The HoursCTR can have any number of Rest periods (including none).

My current solution is not very elegant and basically involves the following steps:

  1. Get the max duration of rest, group by HoursCTR
  2. Select the first (min) RestCTR row that returns this max duration for each HoursCTR
  3. Repeat step 1 (excluding the rows already collected in step 2)
  4. Repeat step 2 (again, excluding rows collected in step 2)
  5. Combine the RestCTR rows (from step 2 and 4) into single table
  6. Get SUM of the Duration pointed to by the rows in step 5, grouped by HoursCTR

If there are any set functions that cut this process down, they would be very welcome.

like image 507
alex.zambila Avatar asked Sep 29 '10 16:09

alex.zambila


1 Answers

The best way to do this in SQL Server is with a common table expression, numbering the rows in each group with the windowing function ROW_NUMBER():

WITH NumberedPeriods AS (
  SELECT HoursCTR, Duration, ROW_NUMBER() 
    OVER (PARTITION BY HoursCTR ORDER BY Duration DESC) AS RN
  FROM #Periods
  WHERE Rest = 1
)
SELECT HoursCTR, SUM(Duration) AS LongestBreaks
FROM NumberedPeriods
WHERE RN <= 2
GROUP BY HoursCTR

edit: I've added an ORDER BY clause in the partitioning, to get the two longest rests.


Mea culpa, I did not notice that you need this to work in Microsoft SQL Server 2000. That version doesn't support CTE's or windowing functions. I'll leave the answer above in case it helps someone else.

In SQL Server 2000, the common advice is to use a correlated subquery:

SELECT p1.HoursCTR, (SELECT SUM(t.Duration) FROM 
    (SELECT TOP 2 p2.Duration FROM #Periods AS p2
     WHERE p2.HoursCTR = p1.HoursCTR 
     ORDER BY p2.Duration DESC) AS t) AS LongestBreaks
FROM #Periods AS p1
like image 165
Bill Karwin Avatar answered Nov 15 '22 20:11

Bill Karwin