I'm playing with BigQuery's Standard SQL Wildcard Table functionality.
I have a table with the following schema:
_PARTITIONTIME TIMESTAMP NULLABLE
stamp TIMESTAMP NULLABLE
value FLOAT NULLABLE
source STRING NULLABLE
...
Because the tables in my dataset have mixed schemas and their names are not commonly prefixed, I intend on making a wide Wildcard match - then, narrowing _TABLE_SUFFIX
in the WHERE
:
SELECT
*,
_TABLE_SUFFIX AS table_name
FROM `my-project.my-dataset.*`
WHERE REGEXP_CONTAINS(_TABLE_SUFFIX, r"cars")
This is as-basic as I can make it. cars
only matches a single table.
I intend on adding additional tables to the REGEX above. e.g: cars|trucks|...
When I execute the above query, I get the following error:
Cannot read field 'stamp' of type TIMESTAMP_MICROS as DATETIME
Furthermore, when I attempt to remove the stamp
from my SELECT
and use the source
column in it's place:
SELECT
source,
_TABLE_SUFFIX AS table_name
FROM `my-project.my-dataset.*`
WHERE REGEXP_CONTAINS(_TABLE_SUFFIX, r"cars")
BigQuery now tells me: Unrecognized name: source at [2:1]
What's happening here in both cases? Why does BigQuery want to cast my stamp
column, and why is it unable to find source
?
What's happening here in both cases? Why does BigQuery want to cast my stamp column, and why is it unable to find source ?
In order to execute a standard SQL query that uses a wildcard table, BigQuery automatically infers the schema for that table. BigQuery uses the schema for the most recently created table that matches the wildcard as the schema for the wildcard table. If the schema is inconsistent across tables matched by the wildcard table, BigQuery returns an error.
This explains both case you see!
In first case - the most recent table has _PARTITIONTIME as DATETIME data type
In second case - same most recent table just simply does not have source
field in it
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With