
Oracle SQL: How to SELECT N records for each "group" / "cluster"

Tags:

oracle

I've got a table, big_table, with 4 million records. They are clustered into 40 groups by a column called "process_type_cod". The list of values that this column may take is stored in a second table. Let's call it small_table.

So, we have big_table with a NOT NULL FK called process_type_cod that points to small_table (assume the column name is the same in both tables).

I want N records (e.g. 10) from big_table for each record of small_table.

That is, 10 records from big_table related to the first record of small_table, UNION 10 different records from big_table related to the second record of small_table, and so on.

Is it possible to do this with a single SQL query?

Revious asked Jul 13 '11 21:07



1 Answer

I recommend an analytic function such as rank() or row_number(). You could do this with hard-coded unions (one query per group), but the analytic function does all the hard work for you.

select *
from 
(
    select
      bt.col_a,
      bt.col_b,
      bt.process_type_cod,
      -- number the rows within each process_type_cod, starting at 1
      row_number() over ( partition by bt.process_type_cod order by bt.col_a nulls last ) rn
    from small_table st
    inner join big_table bt
      on st.process_type_cod = bt.process_type_cod
)
where rn <= 10  -- keep the first N = 10 rows of each group
;

You may not even need that join: because process_type_cod is a NOT NULL FK, every row in big_table already matches a row in small_table, so the join filters nothing out. In that case, just change the FROM clause to use big_table and drop the join.
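As a rough sketch of that simplification, the query below reads from big_table alone. It assumes the same placeholder columns (col_a, col_b) as the query above and that process_type_cod is never NULL in big_table, as the question states.

```sql
-- Sketch only: same result as the joined query, assuming every
-- big_table row has a valid, non-null process_type_cod.
select *
from
(
    select
      bt.col_a,
      bt.col_b,
      bt.process_type_cod,
      -- restart the numbering at 1 for each process_type_cod
      row_number() over ( partition by bt.process_type_cod order by bt.col_a nulls last ) rn
    from big_table bt
)
where rn <= 10  -- N = 10 rows per group
;
```

Groups in small_table that have no rows in big_table simply do not appear in the output, which is also what the inner join produces.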

What this does is run the inner query, split the rows into groups using the 'partition by' clause (here we partition by process_type_cod), and sort the rows within each group using the 'order by' clause (here by col_a). Within each group, a consecutive row number (1, 2, 3, ...) is assigned to each record, restarting at 1 for every new group. In the outer WHERE clause, just keep the records whose number is at most N (here N = 10).
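To make the numbering concrete, here is a small self-contained sketch that can be run without any tables. The sample_rows data is invented purely for illustration (two groups of three rows each), and N is set to 2 to keep the output short.

```sql
-- Hypothetical sample data: two process types, three rows each.
with sample_rows as (
    select 'A' as process_type_cod, 30 as col_a from dual union all
    select 'A', 10 from dual union all
    select 'A', 20 from dual union all
    select 'B', 5  from dual union all
    select 'B', 7  from dual union all
    select 'B', 6  from dual
)
select process_type_cod, col_a, rn
from
(
    select
      process_type_cod,
      col_a,
      -- numbering restarts at 1 for each process_type_cod
      row_number() over ( partition by process_type_cod order by col_a nulls last ) rn
    from sample_rows
)
where rn <= 2  -- N = 2 for this illustration
order by process_type_cod, rn
;
-- Rows returned: (A, 10, 1), (A, 20, 2), (B, 5, 1), (B, 6, 2)
```

Note that row_number() always produces distinct numbers within a group, whereas rank() gives tied rows the same number, so a filter on rank() can return more than N rows per group when the ordering column has ties.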

Jordan Parmer answered Oct 17 '22 16:10
