I have a StudentScores table as listed below, in SQL Server 2012. The grading system is weighted using special rules. For each MATHS result of the student, there will be one row in the result set. The row may or may not have scores for SCIENCE and LITERATURE columns based on whether there is a score available "within two months of the MATHS result date for SCIENCE" and "within one month of the MATHS result date for LITERATURE".
Note: This is a scenario I created to simplify my actual business domain problem.
I created the following query with subqueries. Is there a way to rewrite it without subqueries and make it more efficient?
TABLE
DECLARE @StudentScores TABLE (StudentMarkID INT IDENTITY(1,1) NOT NULL, StudentID INT, SubjectCode VARCHAR(10), ResultDate DATETIME, Score DECIMAL(5,2))
INSERT INTO @StudentScores (StudentID,SubjectCode,ResultDate,Score)
SELECT 1, 'MATHS','2016-01-10',35
UNION ALL
SELECT 1, 'LITERATURE','2016-01-10',62
UNION ALL
SELECT 1, 'SCIENCE','2016-01-30',65
UNION ALL
SELECT 1, 'SCIENCE','2016-02-02',61
UNION ALL
SELECT 1, 'LITERATURE','2016-02-03',60
UNION ALL
SELECT 1, 'MATHS','2016-03-25',55
UNION ALL
SELECT 2, 'LITERATURE','2016-01-10',12
UNION ALL
SELECT 2, 'SCIENCE','2016-01-30',14
UNION ALL
SELECT 2, 'SCIENCE','2016-02-14',12
UNION ALL
SELECT 2, 'LITERATURE','2016-02-14',15
UNION ALL
SELECT 2, 'MATHS','2016-03-25',18
QUERY
SELECT SS.StudentID, Score AS MathsScore,
       ResultDate AS MathsResultDate,
       (SELECT TOP 1 Score
        FROM @StudentScores S2
        WHERE S2.StudentID = SS.StudentID
          AND S2.SubjectCode = 'SCIENCE'
          AND S2.ResultDate >= DATEADD(MONTH,-2,SS.ResultDate)
        ORDER BY S2.ResultDate DESC
       ) AS ScienceScore,
       (SELECT TOP 1 ResultDate
        FROM @StudentScores S2
        WHERE S2.StudentID = SS.StudentID
          AND S2.SubjectCode = 'SCIENCE'
          AND S2.ResultDate >= DATEADD(MONTH,-2,SS.ResultDate)
        ORDER BY S2.ResultDate DESC
       ) AS ScienceResultDate,
       (SELECT TOP 1 Score
        FROM @StudentScores S2
        WHERE S2.StudentID = SS.StudentID
          AND S2.SubjectCode = 'LITERATURE'
          AND S2.ResultDate >= DATEADD(MONTH,-1,SS.ResultDate)
        ORDER BY S2.ResultDate DESC
       ) AS LiteratureScore,
       (SELECT TOP 1 ResultDate
        FROM @StudentScores S2
        WHERE S2.StudentID = SS.StudentID
          AND S2.SubjectCode = 'LITERATURE'
          AND S2.ResultDate >= DATEADD(MONTH,-1,SS.ResultDate)
        ORDER BY S2.ResultDate DESC
       ) AS LiteratureResultDate
FROM @StudentScores SS
WHERE SS.SubjectCode = 'MATHS'
Expected Result
I managed to reduce the query to two reads of the data table: one that gets the Maths rows,
whose dates drive the lookups for the other subjects, and a second that gets the rows for the other subjects:
WITH DataSource_Maths AS
(
    SELECT SS.[StudentID]
          ,SS.[Score] AS [MathsScore]
          ,SS.[ResultDate] AS [MathsResultDate]
          -- this internal ID is used later in the final join between the two CTEs
          -- to tie each row back to the Maths record (and date window) it belongs to
          ,ROW_NUMBER() OVER(ORDER BY SS.[StudentID], SS.[ResultDate]) AS InternalID
    FROM @StudentScores SS
    WHERE SS.[SubjectCode] = 'MATHS'
),
DataSource_Others AS
(
    SELECT DS.[StudentID]
          ,DS.[SubjectCode]
          ,DS.[Score]
          ,DS.[ResultDate]
          ,DS.[RowID]
          ,SS.[InternalID]
    FROM DataSource_Maths SS
    OUTER APPLY
    (
        SELECT *
              -- rank the rows per student and subject, latest first (only the latest ones are kept);
              -- this is what TOP 1 ... ORDER BY ResultDate DESC achieves in your example
              ,DENSE_RANK() OVER (PARTITION BY [StudentID], [SubjectCode] ORDER BY [ResultDate] DESC) AS [RowID]
        FROM @StudentScores
        WHERE
        (
            [ResultDate] >= DATEADD(MONTH, -2, SS.[MathsResultDate]) AND [SubjectCode] = 'SCIENCE'
            OR
            [ResultDate] >= DATEADD(MONTH, -1, SS.[MathsResultDate]) AND [SubjectCode] = 'LITERATURE'
        ) AND [StudentID] = SS.[StudentID]
    ) DS
)
SELECT FDS_M.[StudentID]
      ,FDS_M.[MathsScore] AS [MathsScore]
      ,FDS_M.[MathsResultDate] AS [MathsResultDate]
      ,FDS_S.[Score] AS [ScienceScore]
      ,FDS_S.[ResultDate] AS [ScienceResultDate]
      ,FDS_L.[Score] AS [LiteratureScore]
      ,FDS_L.[ResultDate] AS [LiteratureResultDate]
FROM DataSource_Maths FDS_M
LEFT JOIN DataSource_Others FDS_S
    ON FDS_M.[InternalID] = FDS_S.[InternalID]
    AND FDS_S.[SubjectCode] = 'SCIENCE'
    AND FDS_S.[RowID] = 1
LEFT JOIN DataSource_Others FDS_L
    ON FDS_M.[InternalID] = FDS_L.[InternalID]
    AND FDS_L.[SubjectCode] = 'LITERATURE'
    AND FDS_L.[RowID] = 1;
Of course, in your more complex real-world case you can materialize the CTEs
in temporary tables (for example) in order to simplify and optimize the query.
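As a rough sketch of what that materialization could look like, the same two steps can be written with SELECT ... INTO. The temp table names #Maths and #Others are made up for illustration, the sample data is a shortened copy of student 1 from the question, and CROSS APPLY is used instead of OUTER APPLY because the final LEFT JOINs already keep Maths rows that have no match. Whether this is actually faster depends on your real data volume and indexes, so treat it as a starting point rather than a tuned solution.

```sql
-- Shortened sample data (student 1 only), same shape as the table in the question.
DECLARE @StudentScores TABLE (StudentMarkID INT IDENTITY(1,1) NOT NULL, StudentID INT,
                              SubjectCode VARCHAR(10), ResultDate DATETIME, Score DECIMAL(5,2));
INSERT INTO @StudentScores (StudentID, SubjectCode, ResultDate, Score)
VALUES (1, 'MATHS',      '2016-01-10', 35),
       (1, 'SCIENCE',    '2016-01-30', 65),
       (1, 'SCIENCE',    '2016-02-02', 61),
       (1, 'LITERATURE', '2016-02-03', 60),
       (1, 'MATHS',      '2016-03-25', 55);

-- Step 1: materialize the Maths rows once (#Maths is a hypothetical name).
SELECT SS.StudentID
      ,SS.Score AS MathsScore
      ,SS.ResultDate AS MathsResultDate
      ,ROW_NUMBER() OVER (ORDER BY SS.StudentID, SS.ResultDate) AS InternalID
INTO #Maths
FROM @StudentScores SS
WHERE SS.SubjectCode = 'MATHS';

-- Step 2: materialize the candidate Science/Literature rows per Maths row
-- (#Others is a hypothetical name). The APPLY runs per Maths row, so the
-- student is already fixed and the ranking only needs to split by subject.
SELECT M.InternalID
      ,DS.SubjectCode
      ,DS.Score
      ,DS.ResultDate
      ,DS.RowID
INTO #Others
FROM #Maths M
CROSS APPLY
(
    SELECT S.SubjectCode
          ,S.Score
          ,S.ResultDate
          ,DENSE_RANK() OVER (PARTITION BY S.SubjectCode ORDER BY S.ResultDate DESC) AS RowID
    FROM @StudentScores S
    WHERE S.StudentID = M.StudentID
      AND (   (S.SubjectCode = 'SCIENCE'    AND S.ResultDate >= DATEADD(MONTH, -2, M.MathsResultDate))
           OR (S.SubjectCode = 'LITERATURE' AND S.ResultDate >= DATEADD(MONTH, -1, M.MathsResultDate)))
) DS;

-- Step 3: the final join is the same as before, but reads the temp tables.
SELECT M.StudentID
      ,M.MathsScore
      ,M.MathsResultDate
      ,S.Score AS ScienceScore
      ,S.ResultDate AS ScienceResultDate
      ,L.Score AS LiteratureScore
      ,L.ResultDate AS LiteratureResultDate
FROM #Maths M
LEFT JOIN #Others S
    ON S.InternalID = M.InternalID AND S.SubjectCode = 'SCIENCE' AND S.RowID = 1
LEFT JOIN #Others L
    ON L.InternalID = M.InternalID AND L.SubjectCode = 'LITERATURE' AND L.RowID = 1;

DROP TABLE #Others;
DROP TABLE #Maths;
```

With real tables you could also index the temp tables (for example on InternalID and SubjectCode) before the final join, if the execution plan suggests it helps.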