I ran into an interesting SQL problem today and while I came up with a solution that works I doubt it's the best or most efficient answer. I defer to the experts here - help me learn something and improve my query! RDBMS is SQL Server 2008 R2, query is part of an SSRS report that will run against about 100,000 rows.
Essentially I have a list of IDs that could have multiple values associated with them, the values being Yes, No, or some other string. For ID x, if any of the values are a Yes, x should be Yes, if they are all No, it should be No, if they contain any other values but yes and no, display that value. I only want to return 1 row per ID, no duplicates.
The simplified version and test case:
DECLARE @tempTable table ( ID int, Val varchar(1) )
INSERT INTO @tempTable ( ID, Val ) VALUES ( 10, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 11, 'N')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 11, 'N')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 12, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 12, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 12, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 13, 'N')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 14, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 14, 'N')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 15, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 16, 'Y')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 17, 'F')
INSERT INTO @tempTable ( ID, Val ) VALUES ( 18, 'P')
SELECT DISTINCT t.ID, COALESCE(t2.Val, t3.Val, t4.Val)
FROM @tempTable t
LEFT JOIN
(
SELECT ID, Val
FROM @tempTable
WHERE Val = 'Y'
) t2 ON t.ID = t2.ID
LEFT JOIN
(
SELECT
ID, Val FROM @tempTable
WHERE Val = 'N'
) t3 ON t.ID = t3.ID
LEFT JOIN
(
SELECT ID, Val
FROM @tempTable
WHERE Val <> 'Y' AND Val <> 'N'
) t4 ON t.ID = t4.ID
Thanks in advance.
Let's answer an easier problem: for each id, get the Val which is last in the alphabet. This will work if Y and N are the only values. And the query is much simpler:
SELECT t.ID, MAX(t.Val) FROM t GROUP BY t.ID;
So, reduce your case to the simple case. Use an enum (if your DB supports it) or break the value codes into another table with a collation column (in this case, you could have 1 for Y, 2 for N, and 999 for all other possible values, and you want the smallest). Then
SELECT ID, c.Val FROM
(SELECT t.ID, MIN(codes.collation) AS mx
FROM t join codes on t.Val = codes.Val GROUP BY t.ID) AS q
JOIN codes c ON mx=c.collation;
Here codes has two columns, Val and Collation.
You can also do this with a CTE type query, as long as you have the Values ordered as you want them. This approach has one join to a small lookup table and should be much, much faster than 3 self-joins.
WITH q AS (SELECT t.id, t.Val, ROW_NUMBER() AS r FROM t JOIN codes ON t.Val=codes.Val
PARTITION BY t.id ORDER BY codes.collation)
SELECT q.id, q.Val WHERE r=1;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With