Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

BigQuery StandardSQL: Last 7 Days using _TABLE_SUFFIX

Question: I want to pull data from multiple Google Analytics sessions tables using _TABLE_SUFFIX, but I want to set the suffix parameters to between "seven days ago" and "one day ago" (i.e. pulling data for the last 7 days)

The current syntax (that doesn't work):

#StandardSQL
SELECT
  date,
  SUM (totals.visits) AS visits
FROM
  `projectname.123456789.ga_sessions_*`
WHERE
  _TABLE_SUFFIX BETWEEN 
  'DATE_ADD(CURRENT_TIMESTAMP(), INTERVAL -7 DAY)' AND 
  'DATE_ADD(CURRENT_TIMESTAMP(), INTERVAL -1 DAY)'
GROUP BY 
   date
ORDER BY 
   date ASC

Which gives me the message "Valid: This query will process 0 B when run." To my eyes, there is no error in the syntax, but BigQuery is unable ot read my date functions and thus unable to suffix them to the ga_sessions_* wildcard

Inspiration:

BigQuery Cookbook has an example for legacySQL that I have been basing this on: (https://support.google.com/analytics/answer/4419694?hl=en#7days)

#LegacySQL
SELECT
  date,
  SUM (totals.visits) AS visits
FROM 
  (TABLE_DATE_RANGE([73156703.ga_sessions_], 
  DATE_ADD(CURRENT_TIMESTAMP(), -7, 'DAY'), 
  DATE_ADD(CURRENT_TIMESTAMP(), -1, 'DAY')))
GROUP BY
  date
ORDER BY
  date ASC 

Things I've tried: (that doesn't work)

Using DATE_SUB instead of DATE_ADD and using CURRENT_DATE instead of CURRENT_TIMESTAMP:

WHERE
   _TABLE_SUFFIX BETWEEN 
   'DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)' AND 
   'DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)'

Resulting in "Valid: This query will process 0 B when run."

Using DATE_FORMAT around DATE_SUB and CURRENT_DATE in order to get the dates without dashes:

WHERE
   _TABLE_SUFFIX BETWEEN 
   'FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY))' AND 
   'FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))'

Resulting in "Valid: This query will process 0 B when run."

Tried skippingt he hyphens '' around the DATE_SUB clause

WHERE
  _TABLE_SUFFIX BETWEEN 
  DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) AND
  DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)

Resulting in the following error message "Error: No matching signature for operator BETWEEN for argument types: STRING, DATE, DATE. Supported signature: (ANY) BETWEEN (ANY) AND (ANY)"

Thanks in advance,

like image 320
Malex Berg Avatar asked Jun 27 '17 01:06

Malex Berg


People also ask

How do you subtract days from a date in BigQuery?

Subtracting a specific amount of days, weeks, months, quarters, or years from a date can be done using the DATE_SUB function. The first argument takes a date and the second argument takes an interval, a numeric value, and a unit. The supported units (DATE_PART) are: Days: DAY.

What is _table_suffix in BigQuery?

FROM. `bigquery-public-data.noaa_gsod.gsod*` Each row in the wildcard table contains a special column, _TABLE_SUFFIX , which contains the value matched by the wildcard character. For information on wildcard table syntax, see Wildcard tables in the standard SQL reference.

How do you subtract two dates in BigQuery?

The DATE_DIFF function allows you to find the difference between 2 date objects in the specified date_part interval.

How can I get last day of previous month in BigQuery?

Use LAST_DAY function SELECT LAST_DAY(DATE'2021-05-20'); Result: 2021-05-31. This function can also be used to find out the last day of the quarter or the year. SELECT LAST_DAY(DATE'2021-05-20', quarter); SELECT LAST_DAY(DATE'2021-05-20', year);


3 Answers

Elliott's answer is correct, but if you want to get the most performance out of BigQuery for such kind of query, instead of converting _TABLESUFFIX to DATE, you should convert CURRENT_DATE expressions to strings:

WHERE
  _TABLE_SUFFIX BETWEEN 
  FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)) AND
  FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
like image 116
Mosha Pasumansky Avatar answered Sep 29 '22 03:09

Mosha Pasumansky


You were almost there with your last attempt. You need to convert your string to a date in order to use it in the comparison:

WHERE
  PARSE_DATE('%Y%m%d', _TABLE_SUFFIX) BETWEEN 
  DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) AND
  DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
like image 24
Elliott Brossard Avatar answered Sep 29 '22 04:09

Elliott Brossard


This works for anyone just looking to segment out the last week of data in big query. Works for any data set as long as you have a timestamp!

where TIMESTAMPFIELD >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
like image 40
Will Avatar answered Sep 29 '22 02:09

Will