Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL query to count a column in all tables

So far, I am able to extract the list of database table using the below SQL query:

SELECT
    DISTINCT
    TABLE_SCHEMA,
    TABLE_NAME
FROM
    INFORMATION_SCHEMA.COLUMNS

In each of these table, the first column is named "Year". The values are from year "2011" to year "2017":

CREATE TABLE foo (
    [Year] int,
    AnotherColumn varchar(50),
    ...
)

CREATE TABLE bar (
    [Year] int,
    SomeOtherColumn guid,
    ...
)

CREATE TABLE ...

Now, I would need to count the number of rows of different years in each of the tables, and display the output in the format below:

| TABLE_SCHEMA | TABLE_NAME | 2011                | 2012                | ... | 2017                |
|:-------------|-----------:|:-------------------:|:-------------------:|:----|:-------------------:|
| SCHEMA       | foo        | no. of rows of 2011 | no. of rows of 2012 | ... | no. of rows of 2017 | 
| SCHEMA       | bar        | no. of rows of 2011 | no. of rows of 2012 | ... | no. of rows of 2017 | 
| SCHEMA       | ...        | no. of rows of 2011 | no. of rows of 2012 | ... | no. of rows of 2017 | 

Does anyone have any suggestions? Thanks a lot!

like image 242
Chris Wong Avatar asked Sep 08 '17 02:09

Chris Wong


People also ask

How do I count columns in SQL query?

mysql> SELECT COUNT(*) AS NUMBEROFCOLUMNS FROM INFORMATION_SCHEMA. COLUMNS -> WHERE table_schema = 'business' AND table_name = 'NumberOfColumns'; The output displays the number of columns.

How do I count all tables?

mysql> SELECT SUM(TABLE_ROWS) ->FROM INFORMATION_SCHEMA. TABLES ->WHERE TABLE_SCHEMA = 'business'; The following table returns the count of records. mysql> SELECT table_name, table_rows ->FROM INFORMATION_SCHEMA.

How to count number of rows in a table in SQL?

SQL Query to Count Number of Rows: The SQL COUNT () function is used to return the number of rows in a table. It is used with the Select () statement. Syntax: SELECT COUNT (colmn_name) from table_name;

What is the use of count () function in SQL?

The SQL COUNT () function is used to return the number of rows in a table. It is used with the Select () statement. Syntax: SELECT COUNT (colmn_name) from table_name;

What is the difference between count() and all() in SQL?

The COUNT function returns the number of rows in a group. The ALL keyword includes duplicate values while the DISTINCT keyword removes the duplicate values in the result. The COUNT (*) returns the number of rows in a query including duplicate rows and rows that contain null values.

How do you count unique values in a column in SQL?

Counting Unique Values in a Column Use the COUNT function to identify the number of unique values in a column. In the example, to identify the number of different suppliers whose products appear in the produce department, execute the following query: SELECT COUNT (DISTINCT SupplierID)


1 Answers

While every SQL implementation offers some form of value parameterization, no such facility exists to parameterize object identifiers (e.g. table names, column names, etc) - which means you must resort to Dynamic-SQL, which introduces its own risks (namely SQL injection).

With your specific problem, we can start by trying to solve it without Dynamic-SQL, by assuming a known and fixed set of tables to query, then we can convert it to Dynamic-SQL, hopefully in a safe manner:

SELECT
    'Table1' AS TableName
    [Year],
    COUNT(*) AS YearRowCount
FROM
    Table1
GROUP BY
    [Year]

UNION ALL

SELECT
    'Table2' AS TableName
    [Year],
    COUNT(*) AS YearRowCount
FROM
    Table2
GROUP BY
    [Year]

UNION ALL

...

Hopefully you're seeing a pattern here.

This query, so far, will give us results of this form:

TableName    Year    YearRowCount
'Table1'     2017            1234
'Table1'     2016            2345
'Table1'     2015            3456
'Table1'     2014            1234
'Table1'     2013            1234
'Table1'     2011            1234
'Table2'     2017            1234
'Table2'     2016            2345
'Table2'     2015            3456
'Table2'     2013            1234
'Table2'     2012            1234
'Table2'     2011            1234
...

We can then use PIVOT to transpose the rows into columns. PIVOT (and UNPIVOT) do require you to explicitly name each column to be transposed, unfortunately - but it would be nice if they had a PIVOT ALL feature or something).

SELECT
    tableName,
    YearRowCount,
    [2011], [2012], [2013], [2014], [2015], [2016], [2017]
FROM
(
    -- our UNION query goes here --
)
PIVOT
(
    SUM( YearRowCount )
    FOR [Year] IN ( 2011, 2012, 2013, 2014, 2015, 2016, 2017 )
)

So now we know the pattern of the inner query and the PIVOT statement to surround it, we can make it dynamic.

There are 3 approaches for generating Dynamic SQL on a "for each row..." basis. The first is to use a CURSOR, the second is to use some kind of T-SQL loop (WHILE, etc) - both of these approaches take the iterative approach - but there's a 3rd version which is more functional and syntactically simpler. I'll demonstrate this functional approach.

Also, we can avoid the uglier parts of manual string concatenation by using (abusing) the FORMATMESSAGE function which serves as a sprintf implementation. To use FORMATMESSAGE to format a string requires SQL Server 2016 or later (though Compatibility Level does not need to be 130 as far as I can tell). If you are running an earlier version you'll need to use CONCAT or 'foo' + @var + 'bar'-style concatenation.

I'm also using the COALESCE( [aggregate] + [separator], '' ) + [value] trick described in this answer: https://stackoverflow.com/a/194887/159145 - it's one method of concatenating (aggregating) row values, though it feels a bit ugly. Remember that SQL is primarily concerned with relational algebra of unordered sets of tuple data (i.e. tables), which does not normally cover view-level concerns like ordering or aggregating sorted data - which is what concatenation is.

DECLARE @unionTemplate varchar(1024) = '
SELECT
    ''%s.%s'' AS TableName
    [Year],
    COUNT(*) AS YearRowCount
FROM
    [%s].[%s]
GROUP BY
    [Year]
'

DECLARE @unionSeparator varchar(20) = '
UNION ALL
'

DECLARE @unionQuery varchar(max)

SELECT
    @unionQuery = COALESCE( @unionQuery + @unionSeparator, '' ) + FORMATMESSAGE( @unionTemplate, SCHEMA_NAME, TABLE_NAME, SCHEMA_NAME, TABLE_NAME )
FROM
    INFORMATION_SCHEMA.TABLES
ORDER BY
    SCHEMA_NAME,
    TABLE_NAME

Anyway, this query will generate the query stored in @unionQuery, so now we just need to compose it...

DECLARE @pivotQuery varchar(max) = '
SELECT
    tableName,
    YearRowCount,
    [2011], [2012], [2013], [2014], [2015], [2016], [2017]
FROM
(
    %s
)
PIVOT
(
    SUM( YearRowCount )
    FOR [Year] IN ( 2011, 2012, 2013, 2014, 2015, 2016, 2017 )
)'

SET @pivotQuery = FORMATMESSAGE( @pivotQuery, @unionQuery )

...and execute it (EXEC sp_executesql is preferred over the archaic EXEC()) - also note that EXEC() is not the same thing as EXEC!

EXEC sp_executesql @pivotQuery

ta-da!

Older SQL Server versions (2014, 2012, 2008 R2, 2008):

These are untested, but if you need to run on SQL Server versions older than 2016 (v13.0), try these alternatives to FORMATMESSAGE:

DECLARE @unionQuery nvarchar(max)

SELECT
    @unionQuery =
        COALESCE( @unionQuery + ' UNION ALL ', '' ) +
        CONCAT(
            'SELECT ''',
            SCHEMA_NAME, '.', TABLE_NAME, '[Year],
    COUNT(*) AS YearRowCount
FROM
    [', SCHEMA_NAME, '].[', TABLE_NAME, ']
GROUP BY
    [Year]
'
    )
FROM
    INFORMATION_SCHEMA.TABLES
ORDER BY
    SCHEMA_NAME,
    TABLE_NAME

As the @pivotQuery is only inserted once, it's okay to use REPLACE to insert the inner @unionQuery, but never do this when dealing with user-provided values because you open yourself up to SQL injection-like attacks:

SET @pivotQuery = REPLACE( @pivotQuery, '%s', @unionQuery )
like image 155
Dai Avatar answered Nov 05 '22 15:11

Dai