Is there a way to determine what region (like these) BigQuery is storing my data in? More to the point, is there a way to specify where my data gets stored when sent into BigQuery? If it matters, I'm using both the POST method for bulk loading data and streaming as well.
If the answer to both of these is "no", where does BQ store data? Is it just in the USA, elsewhere...or is it spread all over the globe?
Geospatial analytics let you analyze geographic data in BigQuery.
You can access BigQuery by using the GCP console or the classic web UI, by using a command-line tool, or by making calls to the BigQuery REST API using a variety of client libraries such as Java, .NET, or Python.
One of the key features of BigQuery's architecture is the separation of storage and compute. This allows BigQuery to scale both storage and compute independently, based on demand.
Storage is handled by Colossus, Google's global storage system. BigQuery stores data in Colossus in a compressed columnar format, optimized for reading large amounts of structured data.
Note: Everything in this post should be considered a guideline and not a guarantee. When in doubt, refer to the BigQuery terms of service, which spell out in more detail what is guaranteed with respect to data location.
By default, BigQuery stores your data in us-central1 and us-central2. If you want your BigQuery data to be close to your computation (i.e. GCE), you should move your computation to one of those regions.
BigQuery location information is set on the dataset. There are currently three possible values: US, EU, and unspecified. If it is US, the data is located in the US (us-central1 and us-central2). If it is EU, the data is located in the EU (europe-west1, although additional replicas may be stored elsewhere in the EU). If it is unspecified, it is currently equivalent to storing it in the US.
You can see this by doing a datasets.get() operation, which you can do with the bq command line client via:
bq --format=prettyjson show publicdata:samples | grep location
Note that by default, the location is empty, which means that the location is unspecified.
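If you would rather read the field programmatically than grep for it, here is a minimal sketch in Python. It parses the JSON text that the bq command above prints and falls back to "unspecified" when the location key is missing or empty. The function name dataset_location and the sample payloads are hypothetical and made up for illustration; only the location field itself comes from the answer above.

```python
import json


def dataset_location(show_output: str) -> str:
    """Return the dataset's location, or "unspecified" if none is set.

    `show_output` is assumed to be the JSON text printed by
    `bq --format=prettyjson show <dataset>`.
    """
    dataset = json.loads(show_output)
    # A missing or empty "location" value means the location is unspecified.
    return dataset.get("location") or "unspecified"


# Hypothetical, trimmed-down payloads (real output has many more fields).
sample_eu = '{"id": "myproject:mydataset", "location": "EU"}'
sample_unset = '{"id": "myproject:mydataset"}'

print(dataset_location(sample_eu))
print(dataset_location(sample_unset))
```

In practice the input string would be the captured output of the bq command rather than a hard-coded sample.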
The location must be set when the dataset is created. For now, only a whitelisted set of customers can set their dataset location.
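Since the location is fixed at creation time, it has to travel with the request that creates the dataset. Below is a rough sketch, in Python, of what a datasets.insert request body with a location might look like, assuming your project is allowed to set one. The helper name new_dataset_body and the project and dataset IDs are hypothetical; the sketch only builds the body and does not send anything to BigQuery.

```python
import json
from typing import Optional


def new_dataset_body(project_id: str, dataset_id: str,
                     location: Optional[str] = None) -> dict:
    """Build a request body for creating a dataset (illustrative only)."""
    body = {
        "datasetReference": {
            "projectId": project_id,
            "datasetId": dataset_id,
        }
    }
    # Omitting "location" leaves it unspecified, as described above.
    if location is not None:
        body["location"] = location
    return body


# Hypothetical IDs, for illustration only.
print(json.dumps(new_dataset_body("my-project", "my_dataset", "EU"), indent=2))
```

Leaving the location argument out produces a body with no location key, which matches the "unspecified" case.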