Assume we have a table which has two columns, one column contains the names of some people and the other column contains some values related to each person. One person can have more than one value. Each value has a numeric type. The question is we want to select the top 3 values for each person from the table. If one person has less than 3 values, we select all the values for that person.
The issue can be solved if there are no duplicates in the table by the query provided in this article Select top 3 values from each group in a table with SQL . But if there are duplicates, what is the solution?
For example, if for one name John, he has 5 values related to him. They are 20,7,7,7,4. I need to return the name/value pairs as below order by value descending for each name:
-----------+-------+ | name | value | -----------+-------+ | John | 20 | | John | 7 | | John | 7 | -----------+-------+
Only 3 rows should be returned for John even though there are three 7s for John.
In many modern DBMS (e.g. Postgres, Oracle, SQL-Server, DB2 and many others), the following will work just fine. It uses CTEs and ranking function ROW_NUMBER()
which is part of the latest SQL standard:
WITH cte AS ( SELECT name, value, ROW_NUMBER() OVER (PARTITION BY name ORDER BY value DESC ) AS rn FROM t ) SELECT name, value, rn FROM cte WHERE rn <= 3 ORDER BY name, rn ;
Without CTE, only ROW_NUMBER()
:
SELECT name, value, rn FROM ( SELECT name, value, ROW_NUMBER() OVER (PARTITION BY name ORDER BY value DESC ) AS rn FROM t ) tmp WHERE rn <= 3 ORDER BY name, rn ;
Tested in:
In MySQL and other DBMS that do not have ranking functions, one has to use either derived tables, correlated subqueries or self-joins with GROUP BY
.
The (tid)
is assumed to be the primary key of the table:
SELECT t.tid, t.name, t.value, -- self join and GROUP BY COUNT(*) AS rn FROM t JOIN t AS t2 ON t2.name = t.name AND ( t2.value > t.value OR t2.value = t.value AND t2.tid <= t.tid ) GROUP BY t.tid, t.name, t.value HAVING COUNT(*) <= 3 ORDER BY name, rn ; SELECT t.tid, t.name, t.value, rn FROM ( SELECT t.tid, t.name, t.value, ( SELECT COUNT(*) -- inline, correlated subquery FROM t AS t2 WHERE t2.name = t.name AND ( t2.value > t.value OR t2.value = t.value AND t2.tid <= t.tid ) ) AS rn FROM t ) AS t WHERE rn <= 3 ORDER BY name, rn ;
Tested in MySQL
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With