Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

grouping every N values

I have a table like this in PostgreSQL. I want to perform aggregation functions like mean and max for every 16 records based on ID (which is primary key). For example I have to calculate mean value for first 16 records and second 16 records and so on.

+-----+-------------
| ID  |  rainfall  |
+-----+----------- |
|  1  |  110.2     |
|  2  |  56.6      |
|  3  |  65.6      |
|  4  |  75.9      |
+-----+------------
like image 236
f.ashouri Avatar asked Oct 27 '12 10:10

f.ashouri


1 Answers

The 1st approach that comes to mind is to use row_number() to annotate the table, then group by blocks of 16 rows.

SELECT min(id) as first_id, max(id) AS last_id, avg(rainfall) AS avg_this_16
FROM (
  SELECT id, rainfall, row_number() OVER (order by id) AS n
  FROM the_table
) x(id,rainfall,n)
GROUP BY n/16
ORDER BY n/16;

Note that this won't necessarily include 16 samples for the last group.

Alternately you can calculate a running average by using avg() as a window function:

SELECT id, avg(rainfall) OVER (ORDER BY id ROWS 15 PRECEDING)
FROM the_table;

... possibly annotating that with the row number and selecting the ones you want:

SELECT id AS greatest_id_in_group, avg_last_16_inclusive FROM (
  SELECT
    id, 
    avg(rainfall) OVER (ORDER BY id ROWS 15 PRECEDING) AS avg_last_16_inclusive,
    row_number() OVER (ORDER BY id) AS n
  FROM the_table
) x WHERE n % 16 = 0;

This will disregard the last n<16 samples, not returning a row for them.

Note that I'm assuming the IDs aren't guaranteed to be contiguous. If they are gap-less, you can just group by id/16 and avoid the window function.

like image 86
Craig Ringer Avatar answered Sep 28 '22 15:09

Craig Ringer