
How to group by hour in Google Bigquery

My Google BigQuery table has a date column and a time column. New data is posted to the table every 10 minutes, so one record has, for example, "2018-10-26" in the date field and "19:05:00" in the time field, and the next record has "2018-10-26" and "19:15:00". How can I aggregate the data for each day by hour (24 records per day)?

The SQL request is sent from Google Sheets using Apps Script. Here is part of the Google bigquery.gs script (complete script in GitHub):

...
var sheet = SpreadsheetApp.getActiveSheet();

var sql = 'SELECT date, time, SUM(col1) AS Col1, SUM(col2) AS Col2 ' +
    'GROUP BY time, date ORDER BY time ASC';

var queryResults; 

// Inserts a Query Job
try {
  var queryRequest = BigQuery.newQueryRequest();
  queryRequest.setQuery(sql).setTimeoutMs(100000);
  queryResults = BigQuery.Jobs.query(queryRequest, projectNumber);
}
....
Intars_K asked Oct 26 '18 20:10



1 Answer

The query below is for BigQuery Standard SQL:

#standardSQL
SELECT date, TIME_TRUNC(time, HOUR) hour, SUM(col1) AS Col1, SUM(col2) AS Col2 
FROM `project.dataset.table`
GROUP BY date, hour 
ORDER BY date, hour
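
Because the question sends the query from Apps Script, the dialect matters: a BigQuery query request uses legacy SQL unless told otherwise, while the query above is Standard SQL. Below is a minimal sketch of a request body that sets useLegacySql to false. The helper name buildHourlyRequest is hypothetical, and project.dataset.table is the same placeholder as above, not a real table.

```javascript
// Hypothetical helper: builds the request body for BigQuery.Jobs.query.
// `tableName` is a placeholder such as 'project.dataset.table'.
function buildHourlyRequest(tableName) {
  var sql =
    'SELECT date, TIME_TRUNC(time, HOUR) AS hour, ' +
    'SUM(col1) AS Col1, SUM(col2) AS Col2 ' +
    'FROM `' + tableName + '` ' +
    'GROUP BY date, hour ' +
    'ORDER BY date, hour';
  return {
    query: sql,
    useLegacySql: false, // the query above is Standard SQL
    timeoutMs: 100000    // same timeout as in the question's script
  };
}

// In Apps Script, with the BigQuery advanced service enabled, the request
// could be sent roughly like this (projectNumber comes from the question's script):
//   var queryResults = BigQuery.Jobs.query(
//       buildHourlyRequest('project.dataset.table'), projectNumber);
```

Keeping the #standardSQL line as the first line of the query text is another way to select the dialect.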

You can test and play with the query above using the dummy data from your question:

#standardSQL
WITH `project.dataset.table` AS (
  SELECT DATE "2018-10-26" date, TIME "19:05:00" time, 1 col1, 2 col2 UNION ALL
  SELECT "2018-10-26", "19:15:00", 3, 4
) 
SELECT date, TIME_TRUNC(time, HOUR) hour, SUM(col1) AS Col1, SUM(col2) AS Col2 
FROM `project.dataset.table`
GROUP BY date, hour 
ORDER BY date, hour  

with result

Row date        hour        Col1    Col2     
1   2018-10-26  19:00:00    4       6    
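
To make the grouping logic concrete, here is a rough JavaScript sketch of what truncating to the hour and summing does, run on the same two dummy rows. It is only an illustration of the idea, not how BigQuery evaluates the query; the function names truncToHour and groupByHour are hypothetical, and times are assumed to be 'HH:MM:SS' strings.

```javascript
// Truncate an 'HH:MM:SS' string to the start of its hour,
// mimicking the effect of TIME_TRUNC(time, HOUR).
function truncToHour(time) {
  return time.slice(0, 2) + ':00:00';
}

// Sum col1 and col2 per (date, hour) bucket.
// Row field names mirror the dummy data in the question.
function groupByHour(rows) {
  var buckets = {};
  rows.forEach(function (row) {
    var hour = truncToHour(row.time);
    var key = row.date + ' ' + hour;
    if (!buckets[key]) {
      buckets[key] = { date: row.date, hour: hour, Col1: 0, Col2: 0 };
    }
    buckets[key].Col1 += row.col1;
    buckets[key].Col2 += row.col2;
  });
  // Sort by date, then hour; ISO-style strings sort correctly as text.
  return Object.keys(buckets).sort().map(function (key) {
    return buckets[key];
  });
}

var sample = [
  { date: '2018-10-26', time: '19:05:00', col1: 1, col2: 2 },
  { date: '2018-10-26', time: '19:15:00', col1: 3, col2: 4 }
];
var hourly = groupByHour(sample);
// hourly[0] is { date: '2018-10-26', hour: '19:00:00', Col1: 4, Col2: 6 }
```

Both rows fall into the 19:00:00 bucket, which matches the single result row above.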
Mikhail Berlyant answered Sep 23 '22 14:09