Oracle Analytic function for min value in grouping

I'm new to working with analytic functions.

DEPT EMP   SALARY
---- ----- ------
  10 MARY  100000
  10 JOHN  200000
  10 SCOTT 300000
  20 BOB   100000
  20 BETTY 200000
  30 ALAN  100000
  30 TOM   200000
  30 JEFF  300000

For each department, I want the employee with the minimum salary.

Results should look like:

DEPT EMP   SALARY
---- ----- ------
  10 MARY  100000
  20 BOB   100000
  30 ALAN  100000

EDIT: Here's the SQL I have (but of course, it doesn't work, as it wants emp in the GROUP BY clause as well):

SELECT dept, 
  emp,
  MIN(salary) KEEP (DENSE_RANK FIRST ORDER BY salary)
FROM mytable
GROUP BY dept

asked Oct 07 '09 18:10

Travis Heseman

2 Answers

I think that the Rank() function is not the way to go with this, for two reasons.

Firstly, it is probably less efficient than a Min()-based method.

The reason is that the query has to maintain an ordered list of all salaries per department as it scans the data, and the rank is then assigned later by re-reading this list. In the absence of an index that can be leveraged for this, you cannot assign a rank until the last data item has been read, and maintaining the list is expensive.

So the performance of the Rank() function depends on the total number of elements to be scanned, and if there are enough of them that the sort spills to disk, performance will degrade sharply.
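For reference, the Rank()-based approach being argued against presumably looks something like the sketch below. The other answers are not reproduced here, so this is an assumption about their general shape, and salary_rank is an illustrative alias rather than anything from the original posts.

```sql
-- Sketch of a RANK()-based solution (assumed shape, not quoted from another answer).
-- Each department's rows are ordered by salary before any rank is assigned.
SELECT dept,
       emp,
       salary
FROM   (
       SELECT dept,
              emp,
              salary,
              RANK() OVER (PARTITION BY dept ORDER BY salary) AS salary_rank
       FROM   mytable
       )
WHERE  salary_rank = 1
```

Note the ORDER BY inside the window clause: that is the per-department ordering whose cost is described above.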

This is probably more efficient:

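select dept,
       emp,
       salary
from
       (
       -- attach each department's minimum salary to every row
       SELECT dept,
              emp,
              salary,
              Min(salary) Over (Partition By dept) min_salary
       FROM   mytable
       )
-- keep only the rows whose salary equals their department's minimum
where salary = min_salary
/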

This method only requires that the query maintain a single value per department of the minimum value encountered so far. If a new minimum is encountered then the existing value is modified, otherwise the new value is discarded. The total number of elements that have to be held in memory is related to the number of departments, not the number of rows scanned.

It could be that Oracle has a code path to recognise that the Rank does not really need to be computed in this case, but I wouldn't bet on it.

The second reason for disliking Rank() is that it answers the wrong question. The question is not "Which records have the salary that ranks first when the salaries per department are ordered ascending?", it is "Which records have the salary that is the minimum per department?". That makes a big difference to me, at least.

answered Sep 19 '22 14:09

David Aldridge

I think you were pretty close with your original query. The following runs and matches your test case:

SELECT dept, 
  MIN(emp) KEEP(DENSE_RANK FIRST ORDER BY salary, ROWID) AS emp,
  MIN(salary) KEEP (DENSE_RANK FIRST ORDER BY salary, ROWID) AS salary
FROM mytable
GROUP BY dept

In contrast to the RANK() solutions, this one guarantees at most one row per department. But that hints at a problem: what happens in a department where there are two employees on the lowest salary? The RANK() solutions will return both employees -- more than one row for the department. This answer will pick one arbitrarily and make sure there's only one for the department.
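To make the tie case concrete, here is a small sketch with inline sample data in which MARY and JOHN share the lowest salary in department 10. The data is hypothetical. This variation also drops ROWID from the ORDER BY, because the inline data set has no ROWID to order by, so it is not the exact query above.

```sql
-- Hypothetical data: two employees tie on the lowest salary in department 10.
WITH mytable AS (
  SELECT 10 AS dept, 'MARY' AS emp, 100000 AS salary FROM dual UNION ALL
  SELECT 10, 'JOHN',  100000 FROM dual UNION ALL
  SELECT 10, 'SCOTT', 300000 FROM dual
)
SELECT dept,
       -- KEEP restricts the aggregate to the rows with the lowest salary;
       -- MIN(emp) then picks the alphabetically first name among those rows.
       MIN(emp)    KEEP (DENSE_RANK FIRST ORDER BY salary) AS emp,
       MIN(salary) KEEP (DENSE_RANK FIRST ORDER BY salary) AS salary
FROM   mytable
GROUP BY dept
```

This should return a single row for department 10 (JOHN, 100000), whereas a RANK()-based or Min() Over query on the same data would return both MARY and JOHN. Without ROWID the tie is broken by name rather than by physical row location, which may or may not be what you want.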

answered Sep 19 '22 14:09

William Rose

Tags: sql, oracle, top-n, analytic-functions
