Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Calculating running sum over partition in BigQuery

I'm trying to calculate a running sum over a partition. This seems easier and quicker than the method suggested in BigQuery SQL running totals.

For example:

SELECT corpus,corpus_date,word_count, sum(word_count) over (partition by corpus,corpus_date order by word_count,word DESC) as running_sum FROM [publicdata:samples.shakespeare]

I'm facing 2 problems:

  1. I'm unable to let the sum start with the most common word (word with highest word_count). Setting DESC or ASC just doesn't change anything, and the sum starts with the least common word(s). If I change the order by to include only "order by word_count" than the running sum isn't correct since rows with the same order (== same word_count) yield the same running sum.

  2. In a similar query I'm executing (see below), the first row of the running sum yields a sum of 0, although the field I sum upon isn't 0 for the first row. Why does this happen? How can I workaround the problem to show the correct running sum? The query is:

select * from
(SELECT
mongo_id,
account_id,
event_date,
trx_amount_sum_per_day,
SUM (trx_amount_sum_per_day) OVER (PARTITION BY mongo_id,account_id ORDER BY event_date DESC) AS running_sum,
ROW_NUMBER() OVER (PARTITION BY mongo_id,account_id ORDER BY event_date DESC) AS row_num
FROM [xs-polar-gasket-4:publicdataset.publictable]
) order by event_date desc

like image 939
Lior Avatar asked Nov 27 '13 23:11

Lior


1 Answers

For question 1:

Change:

SELECT
  corpus, corpus_date, word_count, SUM(word_count)
OVER
  (PARTITION BY corpus, corpus_date
  ORDER BY word_count, word DESC) AS running_sum
FROM [publicdata:samples.shakespeare]

To:

SELECT
  corpus, corpus_date, word_count, SUM(word_count)
OVER
  (PARTITION BY corpus, corpus_date
  ORDER BY word_count DESC, word) AS running_sum
FROM [publicdata:samples.shakespeare]

(Original query is sorting by word, but you wanted to sort by word_count)

like image 164
Felipe Hoffa Avatar answered Sep 20 '22 10:09

Felipe Hoffa