How do you write a query where only a select number of rows are returned with either the highest or lowest column value.
i.e. A report with the 5 highest salaried employees?
SELECT MAX (column_name) FROM table_name WHERE column_name NOT IN (SELECT Max (column_name) FROM table_name); First we selected the max from that column in the table then we searched for the max value again in that column with excluding the max value which has already been found, so it results in the 2nd maximum value.
The best way to do this is with analytic functions, RANK() or DENSE_RANK() ...
SQL> select * from (
2 select empno
3 , sal
4 , rank() over (order by sal desc) as rnk
5 from emp)
6 where rnk <= 5
7 /
EMPNO SAL RNK
---------- ---------- ----------
7839 5000 1
7788 3000 2
7902 3000 2
7566 2975 4
8083 2850 5
7698 2850 5
6 rows selected.
SQL>
DENSE_RANK() compresses the gaps when there is a tie:
SQL> select * from (
2 select empno
3 , sal
4 , dense_rank() over (order by sal desc) as rnk
5 from emp)
6 where rnk <= 5
7 /
EMPNO SAL RNK
---------- ---------- ----------
7839 5000 1
7788 3000 2
7902 3000 2
7566 2975 3
8083 2850 4
7698 2850 4
8070 2500 5
7 rows selected.
SQL>
Which behaviour you prefer depends upon your business requirements.
There is also the ROW_NUMBER() analytic function which we can use to return a precise number of rows. However, we should avoid using solutions based on row number unless the business logic is happy to arbitrarily truncate the result set in the event of a tie. There is a difference between asking for the five highest values and the first five records sorted by high values
There is also a non-analytic solution using the ROWNUM pseudo-column. This is clunky because ROWNUM is applied before the ORDER BY clause, which can lead to unexpected results. There is rarely any reason to use ROWNUM instead of ROW_NUMBER() or one of the ranking functions.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With