
Use Google BigQuery to build a histogram graph

How can I write a query that makes rendering a histogram graph easier?

For example, suppose we have 100 million people with ages and want to draw a histogram with buckets for ages 0-10, 11-20, 21-30, etc. What does the query look like?

Has anyone done this? Have you tried connecting the query result to a Google spreadsheet to draw the histogram?

Tom Fishman asked Mar 17 '13 18:03




1 Answer

See the 2019 update, with #standardSQL --Fh


The subquery idea works, as does using "CASE WHEN" to assign each row a bucket and then grouping by that bucket:

SELECT COUNT(field1), bucket 
FROM (
    SELECT field1, CASE WHEN age >=  0 AND age < 10 THEN 1
                        WHEN age >= 10 AND age < 20 THEN 2
                        WHEN age >= 20 AND age < 30 THEN 3
                        ...
                        ELSE -1 END as bucket
    FROM table1) 
GROUP BY bucket

Alternatively, if the buckets are regular, you can just divide and cast to an integer (the bucket numbers then start at 0 rather than 1):

SELECT COUNT(field1), bucket 
FROM (
    SELECT field1, INTEGER(age / 10) as bucket FROM table1)
GROUP BY bucket
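
To sanity-check the bucketing logic before running it over 100 million rows, the divide-and-group idea can be tried locally. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for BigQuery (SQLite's integer division plays the role of INTEGER(age / 10)), the sample ages are made up, and the label helper and the 10-year width are hypothetical choices for illustration. It also fills in empty buckets with zero, since GROUP BY returns no row for a bucket that contains nobody.

```python
import sqlite3

# Hypothetical sample data standing in for table1; these ages are made up.
ages = [3, 9, 10, 15, 27, 29, 41]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (field1 INTEGER, age INTEGER)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", list(enumerate(ages)))

# In SQLite, integer / integer truncates, so age / 10 yields the bucket number.
# This mirrors the divide-and-cast query above; it is not BigQuery syntax.
rows = conn.execute(
    """
    SELECT age / 10 AS bucket, COUNT(field1) AS people
    FROM table1
    GROUP BY bucket
    ORDER BY bucket
    """
).fetchall()


def label(bucket, width=10):
    """Turn a bucket number into a readable range such as '20-29' (hypothetical helper)."""
    low = bucket * width
    return f"{low}-{low + width - 1}"


# Buckets with no people are missing from the query result, so add them as zeros.
counts = dict(rows)
histogram = [(label(b), counts.get(b, 0)) for b in range(max(counts) + 1)]

# Print a rough text histogram, one row per bucket.
for name, n in histogram:
    print(f"{name:>6} {'#' * n} {n}")
```

The resulting list of (label, count) pairs has one entry per bucket, which is the shape a spreadsheet column chart expects.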

Jordan Tigani answered Oct 24 '22 23:10