
Histogram and distribution in Google Data Studio

I have a large number of records showing call durations from various areas and exchanges. The table has three key fields:

Calling_number area call_duration(s)

I need to plot the distribution of calls by duration, in buckets of 1 second or 5 seconds, in Data Studio.

Although Data Studio has no specific function for creating a histogram with a large number of bins, I have seen it done before.

Any help or suggestion on how to do this would be appreciated.

busheriff asked Oct 28 '25 05:10



1 Answer

You can use the RANGE_BUCKET mathematical function available in BigQuery. It scans through a sorted array and returns the 0-based position of a given point's upper bound, which makes it useful for grouping data into partitions, histograms, business-defined rules, and more.

A very simple example for an age histogram:

WITH students AS
(
  SELECT 9 AS age UNION ALL
  SELECT 20 AS age UNION ALL
  SELECT 25 AS age UNION ALL
  SELECT 31 AS age UNION ALL
  SELECT 32 AS age UNION ALL
  SELECT 33 AS age
)
SELECT RANGE_BUCKET(age, [10, 20, 30]) AS age_group, COUNT(*) AS count
FROM students
GROUP BY 1
ORDER BY 1

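with output (age_group 0 is ages below 10, 2 is ages 20 to 29, and 3 is ages 30 and above; group 1, ages 10 to 19, has no rows and so does not appear)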

+--------------+-------+
| age_group    | count |
+--------------+-------+
| 0            | 1     |
| 2            | 2     |
| 3            | 3     |
+--------------+-------+
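
Applied to the call data in the question, the same idea might look like the sketch below. It is only an illustration under assumptions: the inline `calls` data stands in for the real table, the column name `call_duration` (in seconds) is taken from the question, and the 3600-second upper limit and the output names `bucket_start_s` and `call_count` are hypothetical choices. GENERATE_ARRAY builds the bucket boundaries so they do not have to be typed out by hand.

```sql
-- Hypothetical sample rows standing in for the real call table.
WITH calls AS (
  SELECT 'A' AS area, 3 AS call_duration UNION ALL
  SELECT 'A', 7 UNION ALL
  SELECT 'B', 9 UNION ALL
  SELECT 'B', 12 UNION ALL
  SELECT 'B', 61
)
SELECT
  -- Boundaries are 0, 5, 10, ..., 3600 (assumed upper limit of one hour).
  -- RANGE_BUCKET returns i for values in [boundary[i-1], boundary[i]),
  -- so (i - 1) * 5 converts the bucket index to the bucket's lower edge.
  (RANGE_BUCKET(call_duration, GENERATE_ARRAY(0, 3600, 5)) - 1) * 5 AS bucket_start_s,
  COUNT(*) AS call_count
FROM calls
GROUP BY bucket_start_s
ORDER BY bucket_start_s
```

For 1-second buckets, change the step (and the multiplier) from 5 to 1. Calls of 3600 seconds or longer all fall into the last bucket. If the data lives in BigQuery, a query along these lines can be supplied as a custom query in the Data Studio BigQuery connector and plotted as a bar chart with `bucket_start_s` as the dimension and `call_count` as the metric.
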
Mikhail Berlyant answered Oct 31 '25 08:10
