BigQuery SQL for sliding window aggregate

Tags: sql, aggregate-functions, moving-average, sliding-window, google-bigquery

Hi, I have a table that looks like this:

Date         Customer   Pageviews
2014/03/01   abc          5
2014/03/02   xyz          8
2014/03/03   abc          6

I want page view aggregates grouped by week, where each weekly value covers the past 30 days (that is, a sliding window aggregate with a 30-day window, evaluated once per week).

I am using Google BigQuery.

EDIT: Gordon, regarding your comment about "Customer": what I actually need is slightly more complicated, which is why I included Customer in the table above. I am looking for the number of customers who had more than n pageviews in a 30-day window, evaluated every week. Something like this:

Date        Customers>10 pageviews in 30day window
2014/02/01  10
2014/02/08  5
2014/02/15  6
2014/02/22  15

However, to keep it simple, I can work out the rest myself if I can just get a sliding window aggregate of pageviews, ignoring customers altogether. Something like this:

Date        count of pageviews in 30day window
2014/02/01  50
2014/02/08  55
2014/02/15  65
2014/02/22  75


asked Mar 14 '14 21:03

prat

2 Answers

How about this:

-- Outer query: add the current week and the three preceding weekly rows.
SELECT changes + changes1 + changes2 + changes3 changes28days, login, USEC_TO_TIMESTAMP(week)
FROM (
  -- LAG(changes, n) reads the value from n rows earlier for the same login.
  SELECT changes,
         LAG(changes, 1) OVER (PARTITION BY login ORDER BY week) changes1,
         LAG(changes, 2) OVER (PARTITION BY login ORDER BY week) changes2,
         LAG(changes, 3) OVER (PARTITION BY login ORDER BY week) changes3,
         login,
         week
  FROM (
    -- Innermost query: total changed files per login per week.
    SELECT SUM(payload_pull_request_changed_files) changes,
           UTC_USEC_TO_WEEK(created_at, 1) week,
           actor_attributes_login login
    FROM [publicdata:samples.github_timeline]
    WHERE payload_pull_request_changed_files > 0
    GROUP BY week, login
))
-- Rows with fewer than three earlier rows have a NULL sum and are dropped here.
HAVING changes28days > 0

For each user, the innermost query counts how many changes they submitted per week. Then with LAG() we can peek into the previous rows to see how many changes they submitted 1, 2, and 3 weeks earlier. Adding those 4 weeks together gives the number of changes submitted in the last 28 days.
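
In BigQuery standard SQL, the same four-week sum can be sketched with a window frame instead of three LAG() calls. This is only an illustration under assumptions: the `daily_views` CTE is made-up sample data shaped like the table in the question, and `view_date`, `week_start`, and `views_28days` are names invented here. Unlike the LAG() version, it returns partial sums for a customer's first weeks rather than NULL, and because the frame is defined over day numbers rather than row counts, a week with no rows for a customer does not stretch the window.

```sql
-- Hypothetical sample data shaped like the table in the question.
WITH daily_views AS (
  SELECT DATE '2014-03-01' AS view_date, 'abc' AS customer, 5 AS pageviews UNION ALL
  SELECT DATE '2014-03-02', 'xyz', 8 UNION ALL
  SELECT DATE '2014-03-03', 'abc', 6
),
-- Pageviews per customer per calendar week (weeks starting on Monday).
weekly AS (
  SELECT
    customer,
    DATE_TRUNC(view_date, WEEK(MONDAY)) AS week_start,
    SUM(pageviews) AS views
  FROM daily_views
  GROUP BY customer, week_start
)
SELECT
  customer,
  week_start,
  -- Current week plus every week that started up to 21 days earlier,
  -- which covers 4 calendar weeks (28 days).
  SUM(views) OVER (
    PARTITION BY customer
    ORDER BY UNIX_DATE(week_start)
    RANGE BETWEEN 21 PRECEDING AND CURRENT ROW
  ) AS views_28days
FROM weekly
ORDER BY customer, week_start;
```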

Now you can wrap everything in a new query to filter users with changes > X and count them.
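
That wrapping step might look like the sketch below, written in BigQuery standard SQL. The `rolling` CTE stands in for the output of the 28-day query with made-up rows (the logins and values are invented), and the threshold of 10 is an arbitrary stand-in for X.

```sql
-- Hypothetical stand-in for the per-user, per-week 28-day totals.
WITH rolling AS (
  SELECT DATE '2014-02-01' AS week_start, 'alice' AS login, 12 AS changes28days UNION ALL
  SELECT DATE '2014-02-01', 'bob', 4 UNION ALL
  SELECT DATE '2014-02-08', 'alice', 15 UNION ALL
  SELECT DATE '2014-02-08', 'bob', 11
)
SELECT
  week_start,
  -- COUNTIF keeps weeks in the result even when no user passes the threshold.
  COUNTIF(changes28days > 10) AS users_over_threshold
FROM rolling
GROUP BY week_start
ORDER BY week_start;
```

Keep in mind that a user appears in a given week only if they had activity in that week, so users who were active earlier in the window but idle in the current week are not counted for it.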


answered Sep 26 '22 07:09

Felipe Hoffa

I have created the following "Times" table:

Table Details: Dim_Periods
Schema
Date        TIMESTAMP
Year        INTEGER
Month       INTEGER
day         INTEGER
QUARTER     INTEGER
DAYOFWEEK   INTEGER
MonthStart  TIMESTAMP
MonthEnd    TIMESTAMP
WeekStart   TIMESTAMP
WeekEnd     TIMESTAMP
Back30Days  TIMESTAMP   -- the date 30 days before "Date"
Back7Days   TIMESTAMP   -- the date 7 days before "Date"

and I use a query like this to handle "running sums":

-- For each Date in the last 5 months, count timeline rows whose
-- repository_created_at falls within the 7 days up to that Date.
SELECT Date, COUNT(*) AS MovingCNT
FROM (
  SELECT Date, Back7Days
  FROM DWH.Dim_Periods
  WHERE Date < TIMESTAMP(CURRENT_DATE())
    AND Date >= DATE_ADD(CURRENT_TIMESTAMP(), -5, 'month')
) P
CROSS JOIN EACH (
  SELECT repository_url, repository_created_at
  FROM [publicdata:samples.github_timeline]
) L
WHERE TIMESTAMP(repository_created_at) >= Back7Days
  AND TIMESTAMP(repository_created_at) <= Date
GROUP EACH BY Date

Note that the same pattern can be used for "month to date", "week to date", "30 days back", and similar aggregations (for a 30-day window, filter on Back30Days instead of Back7Days). However, performance is not the best, and the query can take a while on larger data sets because of the Cartesian join. Hope this helps.
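
For the 30-day, once-per-week window from the question, the same join idea could be sketched in BigQuery standard SQL without a prebuilt periods table. Everything here is an assumption for illustration: the `daily_views` rows are made up to match the shape of the question's table, the `week_ends` list is generated on the fly, and the column names are invented. As with the query above, the cross join compares every week with every row, so it may be slow on large tables.

```sql
-- Hypothetical sample data shaped like the table in the question.
WITH daily_views AS (
  SELECT DATE '2014-03-01' AS view_date, 'abc' AS customer, 5 AS pageviews UNION ALL
  SELECT DATE '2014-03-02', 'xyz', 8 UNION ALL
  SELECT DATE '2014-03-03', 'abc', 6
),
-- One row per reporting date, 7 days apart.
week_ends AS (
  SELECT week_end
  FROM UNNEST(GENERATE_DATE_ARRAY(DATE '2014-03-01', DATE '2014-03-22', INTERVAL 7 DAY)) AS week_end
)
SELECT
  w.week_end,
  SUM(v.pageviews) AS pageviews_30days
FROM week_ends AS w
CROSS JOIN daily_views AS v
-- Keep rows from the 30 days ending on (and including) week_end.
WHERE v.view_date BETWEEN DATE_SUB(w.week_end, INTERVAL 29 DAY) AND w.week_end
GROUP BY w.week_end
ORDER BY w.week_end;
```

Reporting dates with no pageviews in their window drop out of this result, since no joined rows survive the filter for them.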

answered Sep 23 '22 07:09

N.N.
