I have multiple datasets, each with hundreds of tables in Google BigQuery. I'd like to remove some old, legacy data and I am looking for the most convenient way to know how much storage space my each dataset and table is occupying, so I could make educated decision on what datasets/tables I may remove.
I tried to use bq
command-line tool but couldn't find a way to display table storage and entire dataset storage related information.
You can access metadata about the tables in a dataset by using the TABLES meta-table. I.e., and example:
select * from [publicdata:samples.__TABLES__]
returns
project_id dataset_id table_id creation_time last_modified_time row_count size_bytes type
publicdata samples github_nested 1348782587310 1348782587310 2541639 1694950811 1
publicdata samples github_timeline 1335915950690 1335915950690 6219749 3801936185 1
publicdata samples gsod 1335916040125 1440625349328 14420316 17290009238 1
publicdata samples natality 1335916045005 1440625330604 37826763 23562717384 1
publicdata samples shakespeare 1335916045099 1440625429551 164656 6432064 1
publicdata samples trigrams 1335916127449 1445684180324 68051509 277168458677 1
publicdata samples wikipedia 1335916132870 1445689914564 13797035 38324173849 1
More documentation here: https://cloud.google.com/bigquery/querying-data
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With