How to apply aggregate function only on contiguous rows?

Tags: sql, postgresql, aggregate-functions, window-functions

On PostgreSQL 9.4, I'm trying to apply an aggregate function to each run of contiguous rows. Example:

Input data:

recipe  prod1   prod2   timestamp
0       5       4       2015-07-02 08:10:34.357
0       2       7       2015-07-02 08:13:45.352
0       7       0       2015-07-02 08:16:22.098
1       3       2       2015-07-02 08:22:14.678
1       9       4       2015-07-02 08:22:56.123
2       2       6       2015-07-02 08:26:37.564
2       1       7       2015-07-02 08:27:33.109
2       0       8       2015-07-02 08:31:11.687
0       3       5       2015-07-02 08:40:01.345
1       4       2       2015-07-02 08:42:23.210

Desired output:

recipe  prod1_sum   prod2_avg   timestamp_first             timestamp_last
0       14          3.6666      2015-07-02 08:10:34.357     2015-07-02 08:16:22.098
1       12          3           2015-07-02 08:22:14.678     2015-07-02 08:22:56.123
2       3           7           2015-07-02 08:26:37.564     2015-07-02 08:31:11.687
0       3           5           2015-07-02 08:40:01.345     2015-07-02 08:40:01.345
1       4           2           2015-07-02 08:42:23.210     2015-07-02 08:42:23.210

Basically, I want one output line for each "group" of contiguous rows (when the table is sorted on the timestamp column) that share the same "recipe" value. In the output, prod1_sum is the sum of prod1 in the group, prod2_avg is the average of prod2 in the same group, and the last two columns are the first and last timestamps in the group, respectively. The same "recipe" value can appear in several distinct groups, and I want an output row for each of them.

At the moment, I obtain this in an ugly way, with several queries and a lot of data processing outside the DB, which I really want to avoid and which is not worth showing.

My problem really is the "grouping" of rows. I know how to create an aggregate function that would do what I want, if I could apply it to each group individually. I have looked into window functions, but it seems they would group all values by recipe, which does not respect the "contiguous rows" principle I need.


asked Jul 02 '15 06:07

Pierre-Yves

1 Answer

You can use the following query:

SELECT recipe, SUM(prod1) AS prod1_sum,
       AVG(prod2) AS prod2_avg,
       MIN(timestamp) AS timestamp_first, MAX(timestamp) AS timestamp_last
FROM (
   -- grp is constant within each run of contiguous rows sharing a recipe
   SELECT recipe, prod1, prod2, timestamp,
          ROW_NUMBER() OVER (ORDER BY timestamp)
          -
          ROW_NUMBER() OVER (PARTITION BY recipe
                             ORDER BY timestamp) AS grp
   FROM mytable ) t
GROUP BY recipe, grp   -- grp alone is not unique across recipes
ORDER BY timestamp_first;

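The trick here is to use the ROW_NUMBER window function to identify islands of contiguous recipe values. The first ROW_NUMBER counts rows over the whole table, the second counts rows within each recipe. Inside a run of contiguous rows with the same recipe, both counters increase by 1 per row, so their difference, grp, stays constant. When a different recipe intervenes, the global counter advances while the per-recipe counter does not, so the next run of that recipe gets a larger grp. Because two different recipes can end up with the same grp value, the outer query groups by both recipe and grp.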
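To see how the two row numbers behave on the question's sample data, here is a minimal sketch that runs the inner query and prints both counters next to their difference. It is an illustration only: it assumes Python's bundled sqlite3 module with SQLite 3.25 or later (for window function support) as a stand-in for PostgreSQL, and it stores the timestamps as ISO text, which sorts in the same order. The column names rn_all and rn_recipe are made up for this trace.

```python
import sqlite3

# Sample rows from the question: (recipe, prod1, prod2, timestamp)
rows = [
    (0, 5, 4, "2015-07-02 08:10:34.357"),
    (0, 2, 7, "2015-07-02 08:13:45.352"),
    (0, 7, 0, "2015-07-02 08:16:22.098"),
    (1, 3, 2, "2015-07-02 08:22:14.678"),
    (1, 9, 4, "2015-07-02 08:22:56.123"),
    (2, 2, 6, "2015-07-02 08:26:37.564"),
    (2, 1, 7, "2015-07-02 08:27:33.109"),
    (2, 0, 8, "2015-07-02 08:31:11.687"),
    (0, 3, 5, "2015-07-02 08:40:01.345"),
    (1, 4, 2, "2015-07-02 08:42:23.210"),
]

# In-memory SQLite database standing in for the PostgreSQL table (assumption)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable "
    "(recipe INTEGER, prod1 INTEGER, prod2 INTEGER, timestamp TEXT)"
)
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)

# The two ROW_NUMBER calls from the answer, exposed as separate columns
trace = conn.execute(
    """
    SELECT recipe,
           ROW_NUMBER() OVER (ORDER BY timestamp) AS rn_all,
           ROW_NUMBER() OVER (PARTITION BY recipe
                              ORDER BY timestamp) AS rn_recipe
    FROM mytable
    ORDER BY timestamp
    """
).fetchall()

# grp is the difference of the two counters, exactly as in the answer
groups = [(recipe, rn_all - rn_recipe) for recipe, rn_all, rn_recipe in trace]

print("recipe rn_all rn_recipe grp")
for (recipe, rn_all, rn_recipe), (_, grp) in zip(trace, groups):
    print(f"{recipe:6} {rn_all:6} {rn_recipe:9} {grp:3}")
```

On this data the grp values, in timestamp order, are 0, 0, 0, 3, 3, 5, 5, 5, 5, 7. The value 5 is shared by the run of recipe 2 and the later single row of recipe 0, which is why grouping on grp alone would merge two islands, while grouping on (recipe, grp) yields the five expected groups.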



answered Sep 30 '22 22:09

Giorgos Betsos
