I am trying to get a count of non-null values for each column in my table. I have researched the following previously asked SO questions, and have not found a satisfactory answer:
Query to list number of records in each table in a database
get a count of each value from every column in a table SQL Server
SQL Server count number of distinct values in each column of a table
I have written the following code to try to build out a data dictionary for my table to include the name of the column, the number of populated rows in each column, the data type, the length, and whether it is a primary key:
SELECT
c.name 'Column Name',
p.rows 'Row_Count',
t.Name 'Data type',
c.max_length 'Max Length',
ISNULL(i.is_primary_key, 0) 'Primary Key'
FROM
sys.columns c
INNER JOIN
sys.types t ON c.user_type_id = t.user_type_id
LEFT OUTER JOIN
sys.index_columns ic ON ic.object_id = c.object_id AND ic.column_id = c.column_id
LEFT OUTER JOIN
sys.indexes i ON ic.object_id = i.object_id AND ic.index_id = i.index_id
LEFT OUTER JOIN
sys.partitions p ON p.OBJECT_ID = i.OBJECT_ID and i.index_id = p.index_id
WHERE
c.object_id = OBJECT_ID('my_table')
However, the Row_Count column returns all Nulls.
My expected outcome looks like this:
Column_Name Row_Count Data_Type Max_Length Primary_Key
A 10 varchar 50 0
B 10 varchar 50 0
C 7 float 50 0
D 3 float 50 0
E 10 varchar 50 0
Here is one option that does NOT use Dynamic SQL.
Full disclosure, I suspect DS would be more performant. That said, this would work on any virtually any table, view or query. I'm using master..spt_values
as a demonstration
Example
Select ColumnName = Item
,B.column_ordinal
,Row_Count = sum(1)
,B.system_type_name
,B.max_length
,Distinct_Values = count(DISTINCT Value)
From (
Select C.*
From master..spt_values A
Cross Apply ( values (cast((Select A.* for XML RAW) as xml))) B(XMLData)
Cross Apply (
Select Item = replace(xAttr.value('local-name(.)', 'varchar(100)'),'_x0020_',' ')
,Value = xAttr.value('.','varchar(max)')
From XMLData.nodes('//@*') xNode(xAttr)
) C
) A
Left Join (
Select * from sys.dm_exec_describe_first_result_set('Select * from master..spt_values',null,null )
) B on A.Item=B.name
Group By A.Item
,B.system_type_name
,B.max_length
,B.column_ordinal
Order By B.column_ordinal
Returns
EDIT
As Larnu mentioned this will fail with (var)binary and image. Again, this will NOT perform well on a large table. I've only used such approaches during my discovery phase.
God this is ugly...
DECLARE @SQL nvarchar(MAX);
DECLARE @Table sysname = 'SampleTable';
DECLARE @Schema sysname = 'dbo';
SET @SQL = N'WITH Counts AS (' + NCHAR(13) + NCHAR(10) +
N' SELECT @Schema AS SchemaName,' + NCHAR(13) + NCHAR(10) +
N' @Table AS TableName,' +
STUFF((SELECT N',' + NCHAR(13) + NCHAR(10) +
N' COUNT(' + QUOTENAME(C.COLUMN_NAME) + N') AS ' + QUOTENAME(COLUMN_NAME)
FROM INFORMATION_SCHEMA.COLUMNS C
WHERE C.TABLE_SCHEMA = @Schema
AND C.TABLE_NAME = @Table
FOR XML PATH(N''),TYPE).value('.','nvarchar(MAX)'),1,14,N'') + NCHAR(13) + NCHAR(10) +
N' FROM ' + QUOTENAME(@Table) + N')' + NCHAR(13) + NCHAR(10) +
N'SELECT V.ColumnName,' + NCHAR(13) + NCHAR(10) +
N' V.NonNullCount,' + NCHAR(13) + NCHAR(10) +
N' ISC.DATA_TYPE + ISNULL(DT.S,'''') AS Datatype,' + NCHAR(13) + NCHAR(10) +
N' ISNULL(PK.IsPrimaryKey,''No'') AS PrimaryKey' + NCHAR(13) + NCHAR(10) +
N'FROM Counts C' + NCHAR(13) + NCHAR(10) +
N' CROSS APPLY(VALUES' + STUFF((SELECT N',' + NCHAR(13) + NCHAR(10) +
N' (N' + QUOTENAME(C.COLUMN_NAME,'''') + N',C.' + QUOTENAME(C.COLUMN_NAME) + N')'
FROM INFORMATION_SCHEMA.COLUMNS C
WHERE C.TABLE_NAME = @Table
FOR XML PATH(N''),TYPE).value('.','nvarchar(MAX)'),1,26,N'') + N')V(ColumnName,NonNullCount)' + NCHAR(13) + NCHAR(10) +
N' JOIN INFORMATION_SCHEMA.COLUMNS ISC ON C.SchemaName = ISC.TABLE_SCHEMA' + NCHAR(13) + NCHAR(10) +
N' AND C.TableName = ISC.TABLE_NAME' + NCHAR(13) + NCHAR(10) +
N' AND V.ColumnName = ISC.COLUMN_NAME' + NCHAR(13) + NCHAR(10) +
N' CROSS APPLY (VALUES(''('' + STUFF(CONCAT('','' + CASE ISC.CHARACTER_MAXIMUM_LENGTH WHEN -1 THEN ''MAX'' ELSE CONVERT(varchar(4),ISC.CHARACTER_MAXIMUM_LENGTH) END,' + NCHAR(13) + NCHAR(10)+
N' '','' + CASE WHEN ISC.DATA_TYPE NOT LIKE ''%int'' THEN CONVERT(varchar(4),ISC.NUMERIC_PRECISION) END,' + NCHAR(13) + NCHAR(10) +
N' '','' + CASE WHEN ISC.DATA_TYPE NOT LIKE ''%int'' THEN CONVERT(varchar(4),ISC.NUMERIC_SCALE) END,' + NCHAR(13) + NCHAR(10) +
N' '','' + CONVERT(varchar(4),ISC.DATETIME_PRECISION)),1,1,'''') + '')'')) DT(S)' + NCHAR(13) + NCHAR(10) +
N' OUTER APPLY(SELECT ''Yes'' AS IsPrimaryKey ' + NCHAR(13) + NCHAR(10) +
N' FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS TC' + NCHAR(13) + NCHAR(10) +
N' JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU ON TC.TABLE_SCHEMA = KCU.TABLE_SCHEMA' + NCHAR(13) + NCHAR(10) +
N' AND TC.TABLE_NAME = KCU.TABLE_NAME' + NCHAR(13) + NCHAR(10) +
N' AND TC.CONSTRAINT_NAME = KCU.CONSTRAINT_NAME' + NCHAR(13) + NCHAR(10) +
N' WHERE TC.CONSTRAINT_TYPE = ''PRIMARY KEY''' + NCHAR(13) + NCHAR(10) +
N' AND KCU.COLUMN_NAME = V.ColumnName' + NCHAR(13) + NCHAR(10) +
N' AND TC.TABLE_SCHEMA = ISC.TABLE_SCHEMA' + NCHAR(13) + NCHAR(10) +
N' AND TC.TABLE_NAME = ISC.TABLE_NAME) PK;';
PRINT @SQL; --Might need to use SELECT here
--SELECT @SQL;
EXEC sp_executesql @SQL, N'@Schema sysname,@Table sysname',@Schema = @Schema, @Table = @Table;
db<>fiddle
Honestly, there is a lot going on here. If it needs an explanation, I will try, but it will take some time, so (and no offence), i'm not going to put the effort in if no one is going to want/need to know.
One note, I have been slightly lazy and no joined on SCHEMA_NAME
. If you are using multiple schemas, with same named objects, this will have problem, and will need to be addressed.
Edit: apparently I'm a glutton for punishment. Fixed the Schema "problem" and added some logic in regards to ints
New Fiddle
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With