I would like to use BigQuery to host datasets that others can query without incurring processing charges against my project. I understand that when I upload a dataset to a project, the storage costs are associated with the project. I want others to be able to discover my dataset, access it via their project/account (preferably without my intervention), and run as many queries on it as they choose to pay for. So, storage costs would go to me, but compute costs would go to those who run the queries.
Is there a way to do this in BigQuery? I asked this via the Google Cloud enterprise sales web form but did not get a response.
One of the key features of BigQuery's architecture is the separation of storage and compute. This allows BigQuery to scale both storage and compute independently, based on demand.
A dataset is contained within a specific project. Datasets are top-level containers that are used to organize and control access to your tables and views. A table or view must belong to a dataset, so you need to create at least one dataset before loading data into BigQuery.
When you create a dataset in BigQuery, the dataset name must be unique within its project. The name can be up to 1,024 characters long and may contain letters (uppercase or lowercase), numbers, and underscores.
Absolutely! You can certainly make a dataset public to be queried from other projects, or even share your dataset only with a specific domain, group or user.
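As a rough sketch of what sharing could look like with the `google-cloud-bigquery` Python client: the helper below maps "public", "domain", "group" or "user" onto a read-only dataset access entry and appends it to the dataset's existing entries. The function names `reader_entry` and `share_dataset` and the dataset ID in the usage note are made up for illustration, and the sketch assumes the client library is installed and your credentials are allowed to update the dataset.

```python
def reader_entry(kind, identifier=None):
    """Return (role, entity_type, entity_id) for a read-only grant.

    `kind` is one of "public", "domain", "group" or "user" (names chosen
    for this sketch). "public" maps to the allAuthenticatedUsers special group.
    """
    entity_types = {
        "public": "specialGroup",
        "domain": "domain",
        "group": "groupByEmail",
        "user": "userByEmail",
    }
    if kind not in entity_types:
        raise ValueError(f"unknown kind: {kind!r}")
    if kind == "public":
        return ("READER", "specialGroup", "allAuthenticatedUsers")
    if not identifier:
        raise ValueError("identifier is required unless kind is 'public'")
    return ("READER", entity_types[kind], identifier)


def share_dataset(dataset_id, kind, identifier=None):
    """Append a read-only access entry to an existing dataset.

    Assumes google-cloud-bigquery is installed; imported lazily so the
    pure helper above can be used without it.
    """
    from google.cloud import bigquery

    client = bigquery.Client()
    dataset = client.get_dataset(dataset_id)  # e.g. "my-project.my_dataset"

    role, entity_type, entity_id = reader_entry(kind, identifier)
    entries = list(dataset.access_entries)
    entries.append(
        bigquery.AccessEntry(role=role, entity_type=entity_type, entity_id=entity_id)
    )
    dataset.access_entries = entries

    # Only the access list is sent in the update.
    return client.update_dataset(dataset, ["access_entries"])
```

For example, `share_dataset("my-project.my_dataset", "public")` would open the (hypothetical) dataset to any signed-in Google account, while `share_dataset("my-project.my_dataset", "group", "analysts@example.com")` would limit it to one group.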
In this model, users would be charged for queries to their own Project IDs, while your project covers the storage costs of the datasets. Note that if the users running queries in a different project want to store their resulting tables from their query results, they would of course pay for this storage themselves.
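To illustrate the billing split from the consumer's side, here is a minimal sketch, again assuming the `google-cloud-bigquery` client. The query job runs in whichever project the client is created with, so that project is billed for the bytes processed; the table is referenced by its fully qualified name in your project. All project, dataset and table names here are placeholders, and `qualified_table` and `run_in_consumer_project` are helper names invented for this example.

```python
def qualified_table(owner_project, dataset, table):
    """Build a backtick-quoted `project.dataset.table` reference for standard SQL."""
    return f"`{owner_project}.{dataset}.{table}`"


def run_in_consumer_project(consumer_project, sql):
    """Run `sql` as a job in `consumer_project`, which pays for the query.

    Assumes google-cloud-bigquery is installed and the caller has
    permission to create jobs in `consumer_project`.
    """
    from google.cloud import bigquery

    client = bigquery.Client(project=consumer_project)
    return list(client.query(sql).result())


# Placeholder names: the data lives in "owner-project",
# while the job would run in (and be billed to) "consumer-project".
sql = f"SELECT COUNT(*) AS n FROM {qualified_table('owner-project', 'shared_data', 'events')}"
# rows = run_in_consumer_project("consumer-project", sql)
```

The last line is left commented out because it needs real credentials and real project names.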
BigQuery currently doesn't provide a mechanism for public dataset discovery. You would have to share the details of your project's public dataset(s) yourself. The GitHub Archive project has a good example of this.