Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

google bigquery select from a timestamp column between now and n days ago

I have a dataset in bigquery with a TIMESTAMP column "register_date" (sample value "2017-11-19 22:45:05.000 UTC" ).

I need to filter records based on x days or weeks before today criteria. Example query select all records which are 2 weeks old.

Currently I have this query (which I feel like a kind of hack) that works and returns the correct results

SELECT * FROM `my-pj.my_dataset.sample_table`
WHERE
 (SELECT
  CAST(DATE(register_date) AS DATE)) BETWEEN DATE_ADD(CURRENT_DATE(), INTERVAL -150 DAY)
  AND CURRENT_DATE()
LIMIT 10

My question is do I have to use all that CASTing stuff on a TIMESTAMP column (which seems like over complicating the otherwise simple query)? If I remove the CASting part, my query doesn't run and returns error.

Here is my simplified query

SELECT
  *
FROM
  `my-pj.my_dataset.sample_table`
WHERE
  register_date BETWEEN DATE_ADD(CURRENT_DATE(), INTERVAL -150 DAY)
  AND CURRENT_DATE()
LIMIT
  10

that results into an error

Query Failed
Error: No matching signature for operator BETWEEN for argument types: TIMESTAMP, DATE, DATE. Supported signature: (ANY) BETWEEN (ANY) AND (ANY) at [6:17]

any insight is highly appreciated.

like image 605
Waku-2 Avatar asked Jan 26 '18 09:01

Waku-2


People also ask

How do you subtract days from a date in BigQuery?

Subtracting a specific amount of days, weeks, months, quarters, or years from a date can be done using the DATE_SUB function. The first argument takes a date and the second argument takes an interval, a numeric value, and a unit. The supported units (DATE_PART) are: Days: DAY.

What is interval in BigQuery?

Description. Constructs an INTERVAL object using INT64 values representing the year, month, day, hour, minute, and second. All arguments are optional with default value of 0 and can be used as named arguments.


1 Answers

Use timestamp functions:

SELECT t.*
FROM `my-pj.my_dataset.sample_table` t
WHERE register_date BETWEEN TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL -150 DAY) AND CURRENT_TIMESTAMP()
LIMIT 10;

BigQuery has three data types for date/time values: date, datetime, and timestamp. These are not mutually interchangeable. The basic idea is:

  • Dates have no time component and no timezone.
  • Datetimes have a time component and no timezone.
  • Timestamp has both a time component and a timezone. In fact, it represents the value in UTC.

INTERVAL values are defined in gcp documentation

Conversion between the different values is not automatic. Your error message suggests that register_date is really stored as a Timestamp.

One caveat (from personal experience): the definition of day is based on UTC. This is not much of an issue if you are in London. It can be a bigger issue if you are in another time zone and you want the definition of "day" to be based on the local time zone. If that is an issue for you, ask another question.

like image 80
Gordon Linoff Avatar answered Sep 19 '22 11:09

Gordon Linoff