I am doing a task for my work and unfortunately the tables are designed horribly and I can't do too much as far as modifying the structure (one of our major programs here has been based off these outdated tables for years). That being said, I need to find a way to use SQL Server (TSQL) to count the number of distinct columns in a given row.
I have a table with columns name
, fieldA
, fieldB
, fieldC
, fieldD
, etc
I want to return a table with each of the rows, returning name
and the number of distinct columns for fieldA
- fieldD
Would return
One way to do this is by unpivoting the data and then subsequently grouping it again. This gives you the ability to use count(distinct)
:
select name, count(distinct val)
from t
unpivot (val for col in (FieldA, FieldB, FieldC, FieldD)) unpvt
group by name;
However, the most efficient way to do this is to keep all the processing on a single row -- no join
s or group by
s or union
s. Assuming none of the values are NULL
, the following works for 4 columns:
select name,
4 - ((case when FieldA in (FieldB, FieldC, FieldD) then 1 else 0 end) +
(case when FieldB in (FieldC, FieldD) then 1 else 0 end) +
(case when FieldC in (FieldD) then 1 else 0 end)
)
from t;
That is, start with the total count of columns. Then subtract 1 each time a column is like a column that comes later. This last condition ensures that duplicates are not counted more than once.
DECLARE @x TABLE
(
Name VARCHAR(32),
A VARCHAR(32),
B VARCHAR(32),
C VARCHAR(32),
D VARCHAR(32)
);
INSERT @x VALUES
('Jim', 1,3,4,6),
('John',1,1,1,2),
('Jane',2,2,3,3);
SELECT Name, NumCols = COUNT(DISTINCT x)
FROM @x AS x
UNPIVOT (x FOR y IN (A,B,C,D)) AS up
GROUP BY Name
ORDER BY NumCols DESC;
You can use UNPIVOT
operator to turn columns to rows and then count distinct values:
select c_name
, count(distinct val) as distinct_vals
, count(val) as total_vals
from ( select c_name
, val
from t1
unpivot(
val for col in(col1, col2, col3, col4)
) unpvt
) s
group by c_name
result:
C_NAME DISTINCT_VALS TOTAL_VALS
Jane 2 4
Jim 4 4
John 2 4
SQLFIDDLE Demo
Yet another method:
SELECT
Name,
Cnt = (SELECT COUNT(DISTINCT V) FROM (VALUES (A), (B), (C), (D)) AS v (V))
FROM atable
;
That uses SQL Server 2008 syntax (the VALUES
row constructor). It can be adapted for earlier versions by replacing the VALUES (A), (B), ...
with SELECT A UNION ALL SELECT B UNION ALL ...
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With