Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Splitting SQL result into groups with max size = n

I have a table

 id | volume_id| ... |
----+----------+-----+
  1 |       1  | ... |
  2 |       2  | ... |
  3 |       1  | ... |
  4 |       3  | ... |
  5 |       2  | ... |
  ...

I can do a simple grouping query:

select volume_id, count(*), min(id) as min_id, max(id) as max_id
from my_table
group by volume_id;

Which will produce result:

 volume_id | count | min_id | max_id    
-----------+-------+--------+--------
         1 | 67330 |  ...   | ...
         2 | 67330 |  ...   | ...
         3 | 67330 |  ...   | ...
         4 | 67330 |  ...   | ...

But I would like to split results into groups of 40K rows. So the results should look like:

 volume_id | count | min_id | max_id    
-----------+-------+--------+--------
         1 | 40000 |  ...   | ...      <- first  group of IDs for volume 1
         1 | 27330 |  ...   | ...      <- second group of IDs for volume 1
         2 | 40000 |  ...   | ...
         2 | 27330 |  ...   | ...
         3 | 40000 |  ...   | ...
         4 | 27330 |  ...   | ...

ID's should be split so that max_id for first group should be smaller than min_id from second group and so on.

If any one has idea how to write such query (or plsql function if there is no other way), I would be grateful.

I am using Postgresql 9.5.

like image 270
Michał Albrycht Avatar asked Mar 14 '23 02:03

Michał Albrycht


1 Answers

You can use rank() (or row_number() if there are no duplicates) to enumerate the groups. Then simple arithmetic in the group by:

select volume_id, count(*), min(id) as min_id, max(id) as max_id
from (select t.*,
             rank() over (partition by volume_id order by id) as seqnum
      from my_table t
     ) t
group by volume_id, floor((seqnum - 1) / 40000)
order by volume_id, min(id);
like image 174
Gordon Linoff Avatar answered Mar 16 '23 00:03

Gordon Linoff