Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I improve this SQL query?

I ran into an interesting SQL problem today and while I came up with a solution that works I doubt it's the best or most efficient answer. I defer to the experts here - help me learn something and improve my query! RDBMS is SQL Server 2008 R2, query is part of an SSRS report that will run against about 100,000 rows.

Essentially I have a list of IDs that could have multiple values associated with them, the values being Yes, No, or some other string. For ID x, if any of the values are a Yes, x should be Yes, if they are all No, it should be No, if they contain any other values but yes and no, display that value. I only want to return 1 row per ID, no duplicates.

The simplified version and test case:

DECLARE @tempTable table ( ID int, Val varchar(1) )

INSERT INTO @tempTable ( ID, Val ) VALUES ( 10, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 11, 'N')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 11, 'N')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 12, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 12, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 12, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 13, 'N')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 14, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 14, 'N')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 15, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 16, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 17, 'F')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 18, 'P')


SELECT DISTINCT t.ID, COALESCE(t2.Val, t3.Val, t4.Val)
FROM @tempTable t
LEFT JOIN
(
    SELECT ID, Val
    FROM @tempTable
    WHERE Val = 'Y'
) t2 ON t.ID = t2.ID
LEFT JOIN
(
    SELECT 
    ID, Val FROM @tempTable
    WHERE Val = 'N'
) t3 ON t.ID = t3.ID
LEFT JOIN
(
    SELECT ID, Val
    FROM @tempTable
    WHERE Val <> 'Y' AND Val <> 'N'
) t4 ON t.ID = t4.ID

Thanks in advance.

like image 909
eSamuel Avatar asked Dec 17 '22 10:12

eSamuel


1 Answers

Let's answer an easier problem: for each id, get the Val which is last in the alphabet. This will work if Y and N are the only values. And the query is much simpler:

SELECT t.ID, MAX(t.Val) FROM t GROUP BY t.ID;

So, reduce your case to the simple case. Use an enum (if your DB supports it) or break the value codes into another table with a collation column (in this case, you could have 1 for Y, 2 for N, and 999 for all other possible values, and you want the smallest). Then

SELECT ID, c.Val FROM
     (SELECT t.ID, MIN(codes.collation) AS mx
      FROM t join codes on t.Val = codes.Val GROUP BY t.ID) AS q
JOIN codes c ON mx=c.collation;

Here codes has two columns, Val and Collation.

You can also do this with a CTE type query, as long as you have the Values ordered as you want them. This approach has one join to a small lookup table and should be much, much faster than 3 self-joins.

WITH q AS (SELECT t.id, t.Val, ROW_NUMBER() AS r FROM t JOIN codes ON t.Val=codes.Val 
    PARTITION BY t.id ORDER BY codes.collation)
SELECT q.id, q.Val WHERE r=1;            
like image 163
Andrew Lazarus Avatar answered Jan 07 '23 22:01

Andrew Lazarus