
Compare SQL groups against each other

Tags: sql, group-by

How can one filter a grouped resultset for only those groups that meet some criterion compared against the other groups? For example, only those groups that have the maximum number of constituent records?

I had thought that a subquery as follows should do the trick:

SELECT * FROM (
    SELECT   *, COUNT(*) AS Records
    FROM     T
    GROUP BY X
) t HAVING Records = MAX(Records);

However, adding the final HAVING clause results in an empty recordset. What's going on?

eggyal asked Mar 27 '12 13:03



1 Answer

In MySQL (which I assume you are using, since you posted SELECT *, COUNT(*) FROM T GROUP BY X, which would fail in every other RDBMS that I know of), you can use:

SELECT  T.*
FROM    T
        INNER JOIN
        (   SELECT  X, COUNT(*) AS Records
            FROM    T
            GROUP BY X
            ORDER BY Records DESC
            LIMIT 1
        ) T2
            ON T2.X = T.X

This has been tested in MySQL and avoids the implicit grouping/aggregation in the outer query.
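
As for why the query in the question comes back empty: my understanding is that MAX(Records) in a HAVING clause with no GROUP BY turns the outer query into a single-group aggregate. MySQL then returns at most one row, with the non-aggregated columns taken from an arbitrary row of the subquery, and the HAVING test passes only if that row happens to belong to the largest group. (With ONLY_FULL_GROUP_BY enabled, I would expect the query to be rejected outright instead.) A minimal sketch, assuming a made-up column Y and sample values, neither of which comes from the question:

```sql
-- Hypothetical sample data: column Y and all values are made up for illustration.
DROP TABLE IF EXISTS T;
CREATE TABLE T (X INT, Y VARCHAR(10));
INSERT INTO T (X, Y) VALUES
    (1, 'a'),
    (2, 'b'), (2, 'c'), (2, 'd'),
    (3, 'e'), (3, 'f');

-- An aggregate with no GROUP BY collapses the three groups into one row.
SELECT COUNT(*) AS GroupsSeen, MAX(Records) AS MaxRecords
FROM (
    SELECT   X, COUNT(*) AS Records
    FROM     T
    GROUP BY X
) t;
-- one row: GroupsSeen = 3, MaxRecords = 3

-- Computing the maximum in a separate subquery keeps the outer rows intact.
SELECT  T.*
FROM    T
        INNER JOIN
        (   SELECT  X, COUNT(*) AS Records
            FROM    T
            GROUP BY X
            ORDER BY Records DESC
            LIMIT 1
        ) T2
            ON T2.X = T.X;
-- the three rows with X = 2
```

The point of the first SELECT is only to show the collapse to a single row; the second is the join from above run against the same sample data.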

If you can use window functions together with either TOP/LIMIT with ties or a common table expression, it becomes even shorter:

Window function + CTE (tested in MS SQL Server and PostgreSQL):

WITH CTE AS
(   SELECT  *, COUNT(*) OVER(PARTITION BY X) AS Records
    FROM    T
)
SELECT  *
FROM    CTE
WHERE   Records = (SELECT MAX(Records) FROM CTE)

Window function with TOP (tested in MS SQL Server):

SELECT  TOP 1 WITH TIES *
FROM    (   SELECT  *, COUNT(*) OVER(PARTITION BY X) [Records]
            FROM    T
        ) AS C  -- SQL Server requires an alias on a derived table
ORDER BY Records DESC

Lastly, I have never used Oracle, so apologies for not adding a solution that works on Oracle.


EDIT

My solution for MySQL did not take ties into account, and my suggested fix rather steps on the toes of what you said you want to avoid (duplicate subqueries), so I am not sure I can help after all. However, in case it is still preferable, here is a version that works as required on your fiddle:

SELECT  T.*
FROM    T
        INNER JOIN
        (   SELECT  X
            FROM    T
            GROUP BY X
            HAVING  COUNT(*) = 
                    (   SELECT  COUNT(*) AS Records
                        FROM    T
                        GROUP BY X
                        ORDER BY Records DESC
                        LIMIT 1
                    )
        ) T2
            ON T2.X = T.X
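
To make the tie problem concrete, here is a rough sketch with made-up data in which two groups share the maximum count. The column Y and every value are assumptions for illustration; they are not taken from your fiddle.

```sql
-- Hypothetical sample data: X = 2 and X = 3 tie with three rows each.
DROP TABLE IF EXISTS T;
CREATE TABLE T (X INT, Y VARCHAR(10));
INSERT INTO T (X, Y) VALUES
    (1, 'a'),
    (2, 'b'), (2, 'c'), (2, 'd'),
    (3, 'e'), (3, 'f'), (3, 'g');

-- First version: LIMIT 1 keeps a single group, so one of the tied
-- groups is dropped (which one is not guaranteed).
SELECT  X, COUNT(*) AS Records
FROM    T
GROUP BY X
ORDER BY Records DESC
LIMIT 1;
-- one row, with Records = 3

-- Tie-aware version: LIMIT 1 only supplies the maximum count, and
-- HAVING keeps every group that reaches it.
SELECT  X
FROM    T
GROUP BY X
HAVING  COUNT(*) =
        (   SELECT  COUNT(*) AS Records
            FROM    T
            GROUP BY X
            ORDER BY Records DESC
            LIMIT 1
        );
-- two rows: X = 2 and X = 3
```

Joining T back to the second result, as in the query above, would then return all six rows from both tied groups.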
GarethD answered Sep 30 '22 08:09