
Is there a metadata table to check if the table in BigQuery is partitioned?

I have the following requirement.

We have 1400 sharded tables in BQ, grouped by function. Most of these sharded tables are also time-partitioned by DAY.

Smaller tables are not time-partitioned.

I am trying to create views dynamically in BigQuery by doing a UNION of all the sharded tables and adding a WHERE filter on _PARTITIONTIME.

But the problem is that some of the smaller tables are not time-partitioned, so the query fails.

Many more tables are created on a daily basis, and I can't settle for a static exclusion solution, since that would require manually maintaining a file of table names. (Apache Beam + BQ takes the incoming data and creates new sharded tables for newer functional groups without manual intervention.)

I am trying to exclude the non-time-partitioned tables by using the bq utility, connecting to BQ more than 1000 times to check whether each table is time-partitioned.

For each table I run bq show --format=prettyjson and check the timePartitioning type field.

This is very slow: it takes more than 30 minutes.
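
For reference, the per-table check described above could look roughly like the sketch below. It only parses JSON text in the shape that bq show --format=prettyjson prints for a table, and looks for the timePartitioning.type field. The function name and the sample JSON strings are made up for illustration; real output contains many more fields.

```python
import json


def is_time_partitioned(table_json: str) -> bool:
    """Return True if the table JSON carries a timePartitioning type."""
    table = json.loads(table_json)
    # Non-partitioned tables have no timePartitioning object at all.
    return bool(table.get("timePartitioning", {}).get("type"))


# Hypothetical, heavily trimmed samples of the JSON for two tables.
partitioned = '{"tableReference": {"tableId": "table1"}, "timePartitioning": {"type": "DAY"}}'
plain = '{"tableReference": {"tableId": "table2"}}'

print(is_time_partitioned(partitioned))
print(is_time_partitioned(plain))
```

The parsing itself is cheap; the 30 minutes come from invoking bq once per table.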

I have tried __TABLES_SUMMARY__, but it does not have the partitioning info.

I have also checked SELECT partition_id from [mydataset.table1$__PARTITIONS_SUMMARY__];, but this only works if the table is already partitioned.

Pruthviraj Shivanna asked Jul 03 '17 08:07





1 Answer

You can run the query below in standard SQL in BQ to get this information:

-- Replace project_id, dataset_name and 'table_name' with your own values.
SELECT *
  FROM `project_id.dataset_name.INFORMATION_SCHEMA.COLUMNS`
  WHERE table_name = 'table_name'
    AND is_partitioning_column = 'YES'
Gulshan answered Oct 23 '22 04:10
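
To avoid one lookup per table, the same INFORMATION_SCHEMA.COLUMNS view could probably be queried once for the whole dataset by dropping the table_name filter, and the result used to build the UNION. The sketch below only builds the SQL strings; it does not call BigQuery. The function names, the project/dataset/table names and the start timestamp are all hypothetical. It also assumes that every table the query returns is ingestion-time partitioned (so _PARTITIONTIME exists) and that the sharded tables share a schema, which SELECT * with UNION ALL requires.

```python
from typing import Iterable


def partitioned_tables_query(project: str, dataset: str) -> str:
    """SQL listing every table in the dataset that has a partitioning column."""
    return (
        "SELECT DISTINCT table_name\n"
        f"FROM `{project}.{dataset}.INFORMATION_SCHEMA.COLUMNS`\n"
        "WHERE is_partitioning_column = 'YES'"
    )


def build_union_sql(project: str, dataset: str,
                    tables: Iterable[str], start_ts: str) -> str:
    """UNION ALL over the given tables, filtered on _PARTITIONTIME.

    Assumes each table is ingestion-time partitioned and that all
    tables have compatible schemas.
    """
    selects = [
        f"SELECT * FROM `{project}.{dataset}.{name}` "
        f"WHERE _PARTITIONTIME >= TIMESTAMP('{start_ts}')"
        for name in sorted(tables)
    ]
    return "\nUNION ALL\n".join(selects)


# Hypothetical names; in practice `tables` would be the rows returned
# by running partitioned_tables_query(...) in BigQuery.
print(partitioned_tables_query("my-project", "mydataset"))
print(build_union_sql("my-project", "mydataset",
                      ["shard_b", "shard_a"], "2017-07-01"))
```

Tables that are partitioned on a regular column rather than by ingestion time would also match is_partitioning_column = 'YES', so the list may need a further filter before a _PARTITIONTIME condition is applied to it.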