Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Count number of NULL values in a row

I want to count the number of NULL values in a row.

For example:

Name    Col1    Col2      Col3     Col4 | ansCol
abc     null   a@c.com    null     null | 3
bbc     null    null      null     null | 4

In this example the answer is simple:

SELECT *,
(CASE WHEN Name IS NULL THEN 1 ELSE 0 END +
CASE WHEN Col1 IS NULL THEN 1 ELSE 0 END +       
CASE WHEN Col2 IS NULL THEN 1 ELSE 0 END +       
CASE WHEN Col3 IS NULL THEN 1 ELSE 0 END +       
CASE WHEN Col4 IS NULL THEN 1 ELSE 0 END) as ansCol

The question becomes more complex when we have a very wide table, more than 10 columns. If I try to use the method that was shown I get the following error:

Msg 125, Level 15, State 4, Line 13 Case expressions may only be nested to level 10.

Is there any way to overcome this?

like image 856
Elad L. Avatar asked Dec 04 '16 08:12

Elad L.


Video Answer


3 Answers

The code in your question doesn't have any nested case so you should be able to chain together more than 10 with addition just fine.

But some other possibilities are...

If all the columns have the same datatype you could use (Demo)

SELECT *,
       ansCol = (SELECT COUNT(*) - COUNT(C) FROM (VALUES(c1),(c2),(c3),(c4)) V(C))
FROM   t1 

Or another alternative that doesn't rely on them having the same datatype.

;WITH XMLNAMESPACES('http://www.w3.org/2001/XMLSchema-instance' as ns)
SELECT *, 
        ansCol = (SELECT x.* FOR XML PATH('row'), 
                    ELEMENTS XSINIL, TYPE).value('count(/row/*[@ns:nil])', 'int') 
FROM   t1 x
like image 162
Martin Smith Avatar answered Nov 10 '22 01:11

Martin Smith


In addition to above, a potential approach would be to use PIVOT and UNPIVOT functionality.

Let's first create the table and insert the data per the question:

/* Create table */
CREATE TABLE so_40957006 (
    [name]  VARCHAR(24),
    [col1]  VARCHAR(24),
    [col2]  VARCHAR(24),
    [col3]  VARCHAR(24),
    [col4]  VARCHAR(24)
)

/* Insert data */
INSERT INTO so_40957006 VALUES('abc', null, 'a@c.com', null, null)
INSERT INTO so_40957006 VALUES('bbc', null, null, null, null)

What we will first do is create an identity table so this way we can uniquely identify each row of data and insert the data.

/* Create identity table */
CREATE TABLE so_40957006_id (
    [lid]   INT IDENTITY(1, 1),
    [name]  VARCHAR(24),
    [col1]  VARCHAR(24),
    [col2]  VARCHAR(24),
    [col3]  VARCHAR(24),
    [col4]  VARCHAR(24)
)

/* Insert data into table */
INSERT INTO so_40957006_id 
SELECT 
  ISNULL([name], 1) AS [name]
 ,ISNULL([col1], 1) AS [col1]
 ,ISNULL([col2], 1) AS [col2]
 ,ISNULL([col3], 1) AS [col3]
 ,ISNULL([col4], 1) AS [col4]
FROM so_40957006

Note that in the process we have converted any of the columns that are null to a value of 1 by using IsNull. The table now looks like below.

lid name    col1    col2    col3    col4
1   abc     1       a@c.com 1       1
2   bbc     1       1       1       1

With this, we can now use PIVOT and UNPIVOT to verticalize the data. For example, running the query below

SELECT [lid], [cols], [val]
FROM
    (SELECT [lid], [name], [col1], [col2], [col3], [col4]
        FROM so_40957006_id) p
UNPIVOT
    (val FOR cols IN 
    ([name], [col1], [col2], [col3], [col4])
) AS unpvt

provides you with the vertical output of:

lid cols    val
1   name    abc
1   col1    1
1   col2    a@c.com
1   col3    1
1   col4    1
2   name    bbc
2   col1    1
2   col2    1
2   col3    1
2   col4    1

Since the values you want to count (i.e. the NULLs) have a value of 1, you can just run a GROUP BY + SUM statement:

SELECT [lid], SUM(cast([val] as integer)) as CountNulls
  FROM (
    SELECT [lid], [cols], [val]
    FROM
        (SELECT [lid], [name], [col1], [col2], [col3], [col4]
           FROM so_40957006_id) p
    UNPIVOT
     (val FOR cols IN 
       ([name], [col1], [col2], [col3], [col4])
    ) AS unpvt
) a 
WHERE ISNUMERIC([val]) = 1
GROUP BY [lid]

with the output

lid CountNulls
1   3
2   4

With the [lid] you can JOIN this output back to the original table to generate your result set.

like image 37
Denny Lee Avatar answered Nov 10 '22 00:11

Denny Lee


This feels like a hack, and perhaps you should change your design or reduce the number of columns in your table, but if you must do this one option would be to use a subquery to get around the 10 column nested limit:

SELECT t.*,
       t.firstBatch + t.secondBatch AS ansCol
FROM
(
    SELECT *,
        (CASE WHEN Col1 IS NULL THEN 1 ELSE 0 END +  
         CASE WHEN Col2 IS NULL THEN 1 ELSE 0 END +       
         ...
         CASE WHEN Col10 IS NULL THEN 1 ELSE 0 END) AS firstBatch,
        (CASE WHEN Col11 IS NULL THEN 1 ELSE 0 END +  
         CASE WHEN Col12 IS NULL THEN 1 ELSE 0 END +       
         ...
         CASE WHEN Col20 IS NULL THEN 1 ELSE 0 END) AS secondBatch
    FROM yourTable
) t
like image 22
Tim Biegeleisen Avatar answered Nov 10 '22 00:11

Tim Biegeleisen