Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

sum every 3 rows of a table

I have the following query to count all data every minute.

$sql= "SELECT COUNT(*) AS count, date_trunc('minute', date) AS momento
FROM p WHERE fk_id_b=$id_b GROUP BY date_trunc('minute', date) 
ORDER BY momento ASC";

What I need to do is get the sum of the count for each row with the count of the 2 past minutes.

For example with the result of the $sql query above
|-------date---------|----count----|
|2012-06-21 05:20:00 |      12     |
|2012-06-21 05:21:00 |      14     |
|2012-06-21 05:22:00 |      10     |
|2012-06-21 05:23:00 |      20     |
|2012-06-21 05:24:00 |      25     |
|2012-06-21 05:25:00 |      30     |
|2012-06-21 05:26:00 |      10     |  

I want this result:

|-------date---------|----count----|
|2012-06-21 05:20:00 |      12     |   
|2012-06-21 05:21:00 |      26     |     12+14
|2012-06-21 05:22:00 |      36     |     12+14+10
|2012-06-21 05:23:00 |      44     |     14+10+20
|2012-06-21 05:24:00 |      55     |     10+20+25
|2012-06-21 05:25:00 |      75     |     20+25+30
|2012-06-21 05:26:00 |      65     |     25+30+10
like image 981
Pablo Tapia Avatar asked Jun 29 '12 20:06

Pablo Tapia


3 Answers

This is not so tricky with lag() window function (also on SQL Fiddle):

CREATE TABLE t ("date" timestamptz, "count" int4);
INSERT INTO t VALUES
('2012-06-21 05:20:00',12),
('2012-06-21 05:21:00',14),
('2012-06-21 05:22:00',10),
('2012-06-21 05:23:00',20),
('2012-06-21 05:24:00',25),
('2012-06-21 05:25:00',30),
('2012-06-21 05:26:00',10);

SELECT *,
    "count"
    + coalesce(lag("count", 1) OVER (ORDER BY "date"), 0)
    + coalesce(lag("count", 2) OVER (ORDER BY "date"), 0) AS "total"
  FROM t;
  1. I've double-quoted date and count columns, as these are reserved words;
  2. lag(field, distance) gives me the value of the field column distance rows away from the current one, thus first function gives previous row's value and second call gives the value from the one before;
  3. coalesce() is required to avoid NULL result from lag() function (for the first row in your query there's no “previous” one, thus it's NULL), otherwise the total will also be NULL.
like image 91
vyegorov Avatar answered Sep 29 '22 20:09

vyegorov


Here's a more general solution for the sum of values from current and N previous rows (N=2 in your case).

SELECT "date", 
sum("count") OVER (order by "date" ROWS BETWEEN 2 preceding AND current row)
FROM t
ORDER BY "date";

You can change N between 0 and "Unbounded". This approach gives you a chance to have a parameter in your app "count of the N past minutes". Also, no need for handling default values if out of bounds.

You can find more on this in PostgreSQL docs (4.2.8. Window Function Calls)

like image 38
2 revs, 2 users 87% Avatar answered Sep 29 '22 22:09

2 revs, 2 users 87%


@vyegorov's answer covers it mostly. But I have more gripes than fit into a comment.

  1. Don't use reserved words like date and count as identifiers at all. PostgreSQL allows those two particular key words as identifier - other than every SQL standard. But it's still bad practice. The fact that you can use anything inside double-quotes as identifier, even "; DELETE FROM tbl;" does not make it a good idea. The name "date" for a timestamp is misleading on top of that.

  2. Wrong data type. Example displays timestamp, not timestamptz. Does not make a difference here, but still misleading.

  3. You don't need COALESCE(). With the window functions lag() and lead() you can can provide a default value as 3rd parameter:

Building on this setup:

CREATE TABLE tbl (ts timestamp, ct int4);
INSERT INTO tbl VALUES
  ('2012-06-21 05:20:00', 12)
, ('2012-06-21 05:21:00', 14)
, ('2012-06-21 05:22:00', 10)
, ('2012-06-21 05:23:00', 20)
, ('2012-06-21 05:24:00', 25)
, ('2012-06-21 05:25:00', 30)
, ('2012-06-21 05:26:00', 10);

Query:

SELECT ts, ct + lag(ct, 1, 0) OVER (ORDER BY ts)
              + lag(ct, 2, 0) OVER (ORDER BY ts) AS total
FROM   tbl;

Or better yet: use a single sum() as window aggregate function with a custom window frame:

SELECT ts, sum(ct) OVER (ORDER BY ts ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM   tbl;

Same result.
Related:

  • Group by end of period instead of start date
like image 23
Erwin Brandstetter Avatar answered Sep 29 '22 22:09

Erwin Brandstetter