Creating a public dataset (or: split storage costs and compute costs across two projects)

Tags:

google-bigquery

I would like to use BigQuery to host datasets that others can query without incurring processing charges against my project. I understand that when I upload a dataset to a project, the storage costs are associated with the project. I want others to be able to discover my dataset, access it via their project/account (preferably without my intervention), and run as many queries on it as they choose to pay for. So, storage costs would go to me, but compute costs would go to those who run the queries.

Is there a way to do this in BigQuery? I asked this via the Google Cloud enterprise sales web form but did not get a response.


asked Feb 19 '13 21:02

loren

1 Answer

Absolutely! You can certainly make a dataset public to be queried from other projects, or even share your dataset only with a specific domain, group or user.
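As a rough sketch of those sharing options, the snippet below builds read-only access entries in the shape used by the `access` list of the BigQuery dataset REST resource (a `role` plus one of `specialGroup`, `domain`, `groupByEmail` or `userByEmail`). The helper name `reader_entry` and the example addresses are made up for illustration, and the code only constructs the entries; it does not call the API or change any dataset.

```python
import json

# Grantee keys used in a dataset access entry (dataset REST resource).
_GRANTEE_KEYS = {"specialGroup", "domain", "groupByEmail", "userByEmail"}


def reader_entry(kind, value):
    """Build one read-only access entry (hypothetical helper)."""
    if kind not in _GRANTEE_KEYS:
        raise ValueError(f"unsupported grantee type: {kind}")
    return {"role": "READER", kind: value}


# "Public": any signed-in Google account may read the dataset.
public_entry = reader_entry("specialGroup", "allAuthenticatedUsers")
# Narrower alternatives; the domain and addresses are placeholders.
domain_entry = reader_entry("domain", "example.com")
group_entry = reader_entry("groupByEmail", "analysts@example.com")
user_entry = reader_entry("userByEmail", "someone@example.com")

entries = [public_entry, domain_entry, group_entry, user_entry]
print(json.dumps(entries, indent=2))
```

Entries like these would be appended to the dataset's existing `access` list (via the web UI, the `bq` tool or a client library) rather than replacing it, so the owners' own access is preserved.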

In this model, queries are billed to the project that runs them (the user's own project ID), while your project covers the storage costs of the datasets. Note that if users running queries from a different project want to save their query results as tables, they pay for that storage themselves.
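To make the billing split concrete, here is a minimal sketch of how a query request separates the two projects: the project in the `jobs.query` URL is the one that runs (and pays for) the query, while the table reference inside the SQL points at the dataset owner's project. All project, dataset and table names below are placeholders, and the function only builds the request; it does not send it or handle authentication.

```python
import json


def build_query_request(billing_project, data_project, dataset, table):
    """Return (url, body) for a BigQuery jobs.query call (not sent here)."""
    # The project in the URL runs the job and is charged for the query.
    url = (
        "https://bigquery.googleapis.com/bigquery/v2/"
        f"projects/{billing_project}/queries"
    )
    # The table is addressed by its fully qualified name, which lives in
    # the dataset owner's project (where storage is billed).
    sql = f"SELECT COUNT(*) AS n FROM `{data_project}.{dataset}.{table}`"
    body = {"query": sql, "useLegacySql": False}
    return url, body


# Placeholder names: the querying user's project vs. the owner's project.
url, body = build_query_request(
    "my-billing-project", "dataset-owner-project", "shared_data", "events"
)
print(url)
print(json.dumps(body, indent=2))
```

The same separation applies in the client libraries and the `bq` tool: the default project is the one charged for the query, and the shared table is referenced by its fully qualified name.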

BigQuery currently doesn't provide a mechanism for public dataset discovery. You would have to share the details of your project's public dataset(s) yourself. The GitHub Archive project has a good example of this.


answered Oct 04 '22 04:10

Michael Manoochehri
