Can BigQuery be used as a primary query engine?

I'd like some insight into how feasible it is to use BigQuery as the primary query engine for an analytics tool we are developing. Realistically, our public API will need to handle at least hundreds of concurrent SELECT queries through the PHP SDK (on potentially 100M+ rows), but from the current documentation BigQuery seems geared more towards infrequent querying than towards high-volume, on-demand queries under heavy load.

Some of the businesses listed on the Google website appear to be doing similar things, but I have also seen rate limit figures of 20 concurrent requests, which would appear to rule out this use case for the product?

Andy asked Jun 24 '14 22:06



2 Answers

I'm glad you asked. Normal BigQuery users are subject to concurrent request rate limits, but there's an option that would suit the exact use case you describe: Reserved capacity.

With reserved capacity you get your own "separate cluster", which is subject not to the standard limits but to the ones you define.

Check https://developers.google.com/bigquery/pricing#reserved_cap for more information.

Felipe Hoffa answered Oct 30 '22 22:10



That's an architectural decision. My personal opinion: I would NOT consider BigQuery if you expect many different users to hit the API concurrently. That would be expensive and risky. I think you should keep the raw data in BigQuery and find a mechanism to serve clients more efficiently, perhaps by caching, or by saving some results / snapshots in Datastore or Cloud SQL.
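To make the caching idea concrete, here is a minimal sketch of an in-memory cache with a time-to-live sitting in front of an expensive query function. It is only an illustration under assumptions: `QueryCache`, `run_query`, and the 300-second default are hypothetical names and values, and `run_query` stands in for whatever actually sends the SQL to BigQuery. A real deployment would more likely use a shared store than a per-process dictionary.

```python
import time
from typing import Any, Callable, Dict, Tuple


class QueryCache:
    """Tiny in-memory TTL cache in front of an expensive query function.

    `run_query` is a hypothetical callable that executes the SQL against
    the warehouse and returns the result; it is not part of any SDK.
    """

    def __init__(
        self,
        run_query: Callable[[str], Any],
        ttl_seconds: float = 300.0,  # assumed freshness window
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._run_query = run_query
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps the SQL text to (time stored, cached result).
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, sql: str) -> Any:
        now = self._clock()
        entry = self._entries.get(sql)
        if entry is not None and now - entry[0] < self._ttl:
            # Still fresh: serve from the cache, no warehouse query.
            return entry[1]
        # Missing or expired: run the query once and remember the result.
        result = self._run_query(sql)
        self._entries[sql] = (now, result)
        return result
```

With something like this, many API clients asking for the same report within the freshness window would trigger a single underlying query, which is the point of keeping BigQuery behind a serving layer.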

DanielViveiros answered Oct 30 '22 23:10
