
Is there a way to determine or specify what geo region BigQuery stores data in?

Is there a way to determine what region (like these) BigQuery is storing my data in? More to the point, is there a way to specify where my data gets stored when sent into BigQuery? If it matters, I'm using both the POST method for bulk loading data and the streaming API.

If the answer to both of these is "no", where does BQ store data? Is it just in the USA, elsewhere...or is it spread all over the globe?

Jon Chase asked May 11 '15 14:05



1 Answer

Note: Everything in this post should be considered a guideline and not a guarantee. When in doubt, refer to the BigQuery terms of service, which spell out in more detail what is guaranteed with respect to data location.

By default, BigQuery stores your data in us-central1 and us-central2. If you want your BigQuery data to be close to your computation (e.g. on GCE), you should run your computation in one of those regions.

BigQuery location information is set on the dataset. There are currently three possible values: US, EU, and unspecified. If it is US, the data is located in the US (us-central1 and us-central2). If it is EU, the data is located in the EU (europe-west1, although additional replicas may be stored elsewhere in the EU). If it is unspecified, that is currently equivalent to storing it in the US.

You can see this by doing a datasets.get() operation, which you can do with the bq command line client via:

bq --format=prettyjson show publicdata:samples | grep location

Note that by default, the location is empty, which means that the location is unspecified.
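
As a rough sketch of how that output could be interpreted, the Python snippet below reads dataset metadata in JSON form and applies the rules described in this answer. The helper name describe_location and the sample metadata are made up for illustration. The region lists are copied from this answer, so they are a guideline and not a guarantee.

```python
import json
from typing import List, Tuple

# Region mapping as described in this answer (a guideline, not a guarantee).
REGIONS_BY_LOCATION = {
    "US": ["us-central1", "us-central2"],
    "EU": ["europe-west1"],
}


def describe_location(dataset_json: str) -> Tuple[str, List[str]]:
    """Return (effective location, regions) for dataset metadata JSON.

    Hypothetical helper: an absent or empty "location" field is treated as
    unspecified, which this answer says is currently equivalent to US.
    """
    metadata = json.loads(dataset_json)
    location = metadata.get("location") or "US"
    # Unknown location values map to an empty region list rather than a guess.
    return location, REGIONS_BY_LOCATION.get(location, [])


# Made-up sample metadata with no "location" field (the default case).
sample = '{"id": "publicdata:samples", "kind": "bigquery#dataset"}'
print(describe_location(sample))
```

In practice the JSON string would be the full output of the bq show command above (without the grep), assuming it is captured into a variable first.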

Location must be set when the dataset is created. For now, only a whitelisted set of customers can set their dataset location.
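
Since the location has to be supplied at creation time, it would go in the dataset resource passed to datasets.insert. Here is a minimal sketch of building such a request body, assuming the REST API's datasetReference and location fields. The helper name build_dataset_body and the project and dataset IDs are placeholders, and the snippet only builds the JSON without sending anything.

```python
import json

# Location values named in this answer; anything else is rejected here.
ALLOWED_LOCATIONS = {"US", "EU"}


def build_dataset_body(project_id, dataset_id, location=None):
    """Build a dataset resource dict for a datasets.insert call.

    Hypothetical helper: leaving location as None omits the field, which
    this answer describes as "unspecified" (currently equivalent to US).
    """
    body = {
        "datasetReference": {
            "projectId": project_id,
            "datasetId": dataset_id,
        }
    }
    if location is not None:
        if location not in ALLOWED_LOCATIONS:
            raise ValueError("unsupported location: %r" % (location,))
        body["location"] = location
    return body


# Placeholder identifiers, for illustration only.
print(json.dumps(build_dataset_body("my-project", "my_dataset", "EU"), indent=2))
```

Whether the request is accepted still depends on the whitelisting mentioned above.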

Jordan Tigani answered Dec 31 '22 18:12