Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Count Number of Distinct Columns in a Row with tSQL

I am doing a task for my work and unfortunately the tables are designed horribly and I can't do too much as far as modifying the structure (one of our major programs here has been based off these outdated tables for years). That being said, I need to find a way to use SQL Server (TSQL) to count the number of distinct columns in a given row.

Example:

I have a table with columns name, fieldA, fieldB, fieldC, fieldD, etc I want to return a table with each of the rows, returning name and the number of distinct columns for fieldA - fieldD

Visually:

  • Jim - 1 - 3 - 4 - 6
  • John - 1 - 1 - 1 - 2
  • Jane 2 - 2 - 3 - 3

Would return

  • Jim - 4
  • John - 2
  • Jane - 2
like image 894
user2066880 Avatar asked Jul 31 '13 17:07

user2066880


4 Answers

One way to do this is by unpivoting the data and then subsequently grouping it again. This gives you the ability to use count(distinct):

select name, count(distinct val)
from t
unpivot (val for col in (FieldA, FieldB, FieldC, FieldD)) unpvt
group by name;

However, the most efficient way to do this is to keep all the processing on a single row -- no joins or group bys or unions. Assuming none of the values are NULL, the following works for 4 columns:

select name,
       4 - ((case when FieldA in (FieldB, FieldC, FieldD) then 1 else 0 end) +
            (case when FieldB in (FieldC, FieldD) then 1 else 0 end) +
            (case when FieldC in (FieldD) then 1 else 0 end)
           )
from t;

That is, start with the total count of columns. Then subtract 1 each time a column is like a column that comes later. This last condition ensures that duplicates are not counted more than once.

like image 90
Gordon Linoff Avatar answered Sep 27 '22 18:09

Gordon Linoff


DECLARE @x TABLE
(
  Name VARCHAR(32),
  A VARCHAR(32),
  B VARCHAR(32),
  C VARCHAR(32),
  D VARCHAR(32)
);

INSERT @x VALUES
('Jim', 1,3,4,6),
('John',1,1,1,2),
('Jane',2,2,3,3);

SELECT Name, NumCols = COUNT(DISTINCT x) 
FROM @x AS x
UNPIVOT (x FOR y IN (A,B,C,D)) AS up
GROUP BY Name
ORDER BY NumCols DESC; 
like image 30
Aaron Bertrand Avatar answered Sep 27 '22 18:09

Aaron Bertrand


You can use UNPIVOT operator to turn columns to rows and then count distinct values:

select c_name
     , count(distinct val) as distinct_vals
     , count(val)          as total_vals
  from ( select c_name
              , val
           from t1
         unpivot(
             val for col in(col1, col2, col3, col4) 
          ) unpvt
        ) s
group by c_name

result:

C_NAME  DISTINCT_VALS   TOTAL_VALS
Jane    2               4
Jim     4               4
John    2               4

SQLFIDDLE Demo

like image 42
Nick Krasnov Avatar answered Sep 27 '22 17:09

Nick Krasnov


Yet another method:

SELECT
   Name,
   Cnt = (SELECT COUNT(DISTINCT V) FROM (VALUES (A), (B), (C), (D)) AS v (V))
FROM atable
;

That uses SQL Server 2008 syntax (the VALUES row constructor). It can be adapted for earlier versions by replacing the VALUES (A), (B), ... with SELECT A UNION ALL SELECT B UNION ALL ....

like image 32
Andriy M Avatar answered Sep 27 '22 18:09

Andriy M