Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Select statement split into groupings / chunks

Tags:

sql

postgresql

I am trying to do a select statement that is grouped into multiple 'chunks', defined by a user defined number (eg. 5, from lets say 12 rows)

What I would like to achieve, is a select statement that appends a column defining an incrementing group value of the defined chunk size, so for 12 rows, there would be rows 1-5 equal chunk 1, rows 6-10 equal chunk 2, rows 11-12 equal chunk 3.

I have tried searching on a few occassions for this solution with little success, I thought there would be a way to modify a basic ROW_NUMBER() OVER (ORDER BY id), that could reset the count after every 5 rows, and increase the group/chunk.

I realise there are other ways of achieving this result, but I am looking for the fastest simplest/cleanest solution. Idea is to apply this to a function that has to work out best case scenario grouping for batch submissions.

Example of desired output.


| Row no. | Item Id. | Chunk No. |
|--------:|:--------:|:---------:|
|    1    |   1001   |     1     |
|    2    |   1002   |     1     |
|    3    |   1003   |     1     |
|    4    |   1004   |     1     |
|    5    |   1005   |     1     |
|    6    |   1006   |     2     |
|    7    |   1007   |     2     |
|    8    |   1008   |     2     |
|    9    |   1009   |     2     |
|    10   |   1010   |     2     |
|    11   |   1011   |     3     |  
|    11   |   1012   |     3     |

like image 946
wingman Avatar asked Oct 18 '13 11:10

wingman


People also ask

How do you split data into a group in SQL?

The SQL NTILE() is a window function that allows you to break the result set into a specified number of approximately equal groups, or buckets. It assigns each group a bucket number starting from one. For each row in a group, the NTILE() function assigns a bucket number representing the group to which the row belongs.

Can we use select * with group by?

Cannot use an aggregate or a subquery in an expression used for the group by list of a GROUP BY clause. The original idea was to create the table in beginning of the query, so the (SELECT * FROM #TBL) could be used on the query itself, instead of defining the names on each GROUP BY.


2 Answers

You should just divide the RowNum to 5 for example with /

select ROW_NUMBER() over (ORDER BY id) RowNum,
       id,
       ((ROW_NUMBER() over (ORDER BY id) - 1)  / 5) +1 as ChunkNo  
from t

SQLFiddle demo

like image 200
valex Avatar answered Sep 22 '22 09:09

valex


Just as another solution with custom labels:

SELECT 
ROW_NUMBER() over (ORDER BY ID) AS RowNum,
id,
  CASE
    WHEN ntile(3) over(ORDER BY ID)=1 THEN 'Label 1'
    WHEN ntile(3) over(ORDER BY ID)=2 THEN 'Label 2'
    WHEN ntile(3) over(ORDER BY ID)=3 THEN 'Label 3'
    ELSE 'OTHER'
  END AS My_Chunks
FROM t

or simple

SELECT 
 ROW_NUMBER() over (ORDER BY ID) AS RowNum,
 id,
 ntile(3) over(ORDER BY ID) AS My_Chunks
FROM t
like image 28
Andrew Avatar answered Sep 22 '22 09:09

Andrew