Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

BigQuery SQL running totals

Any idea how to calculate running total in BigQuery SQL?

id   value   running total
--   -----   -------------
1    1       1
2    2       3
3    4       7
4    7       14
5    9       23
6    12      35
7    13      48
8    16      64
9    22      86
10   42      128
11   57      185
12   58      243
13   59      302
14   60      362 

Not a problem for traditional SQL servers using either correlated scalar query:

SELECT a.id, a.value, (SELECT SUM(b.value)
                       FROM RunTotalTestData b
                       WHERE b.id <= a.id)
FROM   RunTotalTestData a
ORDER BY a.id;

or join:

SELECT a.id, a.value, SUM(b.Value)
FROM   RunTotalTestData a,
       RunTotalTestData b
WHERE b.id <= a.id
GROUP BY a.id, a.value
ORDER BY a.id;

But I couldn't find a way to make it work in BigQuery...

like image 802
Sasa Avatar asked Feb 02 '13 17:02

Sasa


1 Answers

2018 update: The query in the original question works without modification now.

#standardSQL
WITH RunTotalTestData AS (
  SELECT * FROM UNNEST([STRUCT(1 AS id, 1 AS value),(2,0),(3,1),(4,1),(5,2),(6,3)]) 
)

SELECT a.id, a.value, (SELECT SUM(b.value)
                       FROM RunTotalTestData b
                       WHERE b.id <= a.id) runningTotal
FROM   RunTotalTestData a
ORDER BY a.id;

enter image description here

2013 update: You can use SUM() OVER() to calculate running totals.

In your example:

SELECT id, value, SUM(value) OVER(ORDER BY id)
FROM [your.table]

A working example:

SELECT word, word_count, SUM(word_count) OVER(ORDER BY word)
FROM [publicdata:samples.shakespeare]
WHERE corpus  = 'hamlet'
AND word > 'a' LIMIT 30;
like image 175
Felipe Hoffa Avatar answered Oct 13 '22 19:10

Felipe Hoffa